Loop Summarization with Rational Vector Addition Systems
Abstract
This paper presents a technique for computing numerical loop summaries. The method synthesizes a rational vector addition system with resets (\(\mathbb {Q}\)VASR) that simulates the action of an input loop, and then uses the reachability relation of that \(\mathbb {Q}\)VASR to overapproximate the behavior of the loop. The key technical problem solved in this paper is to automatically synthesize a \(\mathbb {Q}\)VASR that is a best abstraction of a given loop in the sense that (1) it simulates the loop and (2) it is simulated by any other \(\mathbb {Q}\)VASR that simulates the loop. Since our loop summarization scheme is based on computing the exact reachability relation of a best abstraction of a loop, we can make theoretical guarantees about its behavior. Moreover, we show experimentally that the technique is precise and performant in practice.
1 Introduction
Modern software verification techniques employ a number of heuristics for reasoning about loops. While these heuristics are often effective, they are unpredictable. For example, an abstract interpreter may fail to find the most precise invariant expressible in the language of its abstract domain due to imprecise widening, or a software model checker might fail to terminate because it generates interpolants that are insufficiently general. This paper presents a loop summarization technique that is capable of generating loop invariants in an expressive and decidable language and provides theoretical guarantees about invariant quality.
The key idea behind our technique is to leverage reachability results of vector addition systems (VAS) for invariant generation. Vector addition systems are a class of infinite-state transition systems with decidable reachability, classically used as a model of parallel systems [12]. We consider a variation of VAS, rational VAS with resets (\(\mathbb {Q}\)VASR), wherein there is a finite number of rational-typed variables and a finite set of transitions that simultaneously update each variable in the system by either adding a constant value or (re)setting the variable to a constant value. Our interest in \(\mathbb {Q}\)VASRs stems from the fact that there is a (polytime) procedure to compute a linear arithmetic formula that represents a \(\mathbb {Q}\)VASR’s reachability relation [8].
Since the reachability relation of a \(\mathbb {Q}\)VASR is computable, the dynamics of \(\mathbb {Q}\)VASR can be analyzed without relying on heuristic techniques. However, there is a gap between \(\mathbb {Q}\)VASR and the loops that we are interested in summarizing. The latter typically use a rich set of operations (memory manipulation, conditionals, non-constant increments, non-linear arithmetic, etc.) and cannot be analyzed precisely. We bridge the gap with a procedure that, for any loop, synthesizes a \(\mathbb {Q}\)VASR that simulates it. The reachability relation of the \(\mathbb {Q}\)VASR can then be used to overapproximate the behavior of the loop. Moreover, we prove that if a loop is expressed in linear rational arithmetic (LRA), then our procedure synthesizes a best \(\mathbb {Q}\)VASR abstraction, in the sense that it simulates any other \(\mathbb {Q}\)VASR that simulates the loop. That is, imprecision in the analysis is due to inherent limitations of the \(\mathbb {Q}\)VASR model, rather than heuristic algorithmic choices.
One limitation of the model is that \(\mathbb {Q}\)VASRs overapproximate multi-path loops by treating the choice between paths as nondeterministic. We show that \(\mathbb {Q}\)VASRS, \(\mathbb {Q}\)VASR extended with control states, can be used to improve our invariant generation scheme by encoding control flow information and inter-path control dependencies that are lost in the \(\mathbb {Q}\)VASR abstraction. We give an algorithm for synthesizing a \(\mathbb {Q}\)VASRS abstraction of a given loop, which (like our \(\mathbb {Q}\)VASR abstraction algorithm) synthesizes best abstractions under certain assumptions.
Finally, we note that our analysis techniques extend to complex control structures (such as nested loops) by employing summarization compositionally (i.e., “bottomup”). For example, our analysis summarizes a nested loop by first summarizing its inner loops, and then uses the summaries to analyze the outer loop. As a result of compositionality, our analysis can be applied to partial programs, is easy to parallelize, and has the potential to scale to large code bases.
The contributions of this paper are as follows:

We present a procedure to synthesize \(\mathbb {Q}\)VASR abstractions of transition formulas. For transition formulas in linear rational arithmetic, the synthesized \(\mathbb {Q}\)VASR abstraction is a best abstraction.

We present a technique for improving the precision of our analysis by using \(\mathbb {Q}\)VASR with states to capture loop control structure.

We implement the proposed loop summarization techniques and show that their ability to verify user assertions is comparable to software model checkers, while at the same time providing theoretical guarantees of termination and invariant quality.
1.1 Outline
We illustrate the analysis on an integer model of a persistent queue data structure, pictured in Fig. 1. The example consists of two operations (enqueue and dequeue), as well as a test harness (harness) that nondeterministically executes enqueue and dequeue operations. The queue achieves O(1) amortized memory operations (mem_ops) in enqueue and dequeue by implementing the queue as two lists, front and back (whose lengths are modeled as front_len and back_len, respectively): the sequence of elements in the queue is the front list followed by the reverse of the back list. We will show that the queue functions use O(1) amortized memory operations by finding a summary for harness that implies a linear bound on mem_ops (the number of memory operations in the computation) in terms of nb_ops (the total number of enqueue/dequeue operations executed in some sequence of operations).
\(\begin{array}{l} \begin{bmatrix}v\\ w\\ x\\ y\\ z\end{bmatrix} = \underbrace{\begin{bmatrix} 0 & 0 & 0 & 1 & 0\\ 0 & 1 & 0 & 0 & 0\\ 0 & 3 & 1 & 0 & 0\\ 1 & 1 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 1 \end{bmatrix}}_{S_{\texttt{har}}} \begin{bmatrix}\texttt{front\_len}\\ \texttt{back\_len}\\ \texttt{mem\_ops}\\ \texttt{size}\\ \texttt{nb\_ops}\end{bmatrix}; \text{ i.e., } \left( \begin{array}{ll} &\texttt{size} = v\\ \wedge &\texttt{back\_len} = w\\ \wedge &\texttt{mem\_ops}+3\,\texttt{back\_len} = x\\ \wedge &\texttt{back\_len} + \texttt{front\_len} = y\\ \wedge &\texttt{nb\_ops} = z \end{array}\right) . \\ V_{\texttt{har}} = \left\{ \underbrace{\begin{bmatrix}v\\ w\\ x\\ y\\ z\end{bmatrix} \rightarrow \begin{bmatrix}v+1\\ w+1\\ x+4\\ y+1\\ z+1\end{bmatrix}}_{\texttt{enqueue}}, \underbrace{\begin{bmatrix}v\\ w\\ x\\ y\\ z\end{bmatrix} \rightarrow \begin{bmatrix}v-1\\ w\\ x+2\\ y-1\\ z+1\end{bmatrix}}_{\texttt{dequeue}\ \text{fast}}, \underbrace{\begin{bmatrix}v\\ w\\ x\\ y\\ z\end{bmatrix} \rightarrow \begin{bmatrix}v-1\\ 0\\ x+2\\ y-1\\ z+1\end{bmatrix}}_{\texttt{dequeue}\ \text{slow}} \right\} \end{array}\)
Unlike the dequeue loop, we do not get an exact characterization of the dynamics of each changed variable. In particular, in the slow dequeue path through the loop, the values of front_len, back_len, and mem_ops change by a variable amount. Since back_len is set to 0, its behavior can be captured by a reset. The dynamics of front_len and mem_ops cannot be captured by a \(\mathbb {Q}\)VASR, but (using our dequeue summary) we can observe that the sum front_len + back_len is decremented by 1, and the sum mem_ops + 3·back_len is incremented by 2.
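To make the amortized-cost argument concrete, the following sketch (our own illustration, not the paper's implementation) simulates random interleavings of the three transformers of \(V_{\texttt{har}}\) above and checks that the coordinate \(x\) (tracking mem_ops + 3·back_len) never exceeds four times \(z\) (tracking nb_ops):

```python
from random import choice, seed

# The three transformers of V_har over (v, w, x, y, z). Each is a pair
# (reset vector, addition vector): dimension i becomes r[i]*state[i] + a[i].
ENQUEUE      = ((1, 1, 1, 1, 1), (1, 1, 4, 1, 1))
DEQUEUE_FAST = ((1, 1, 1, 1, 1), (-1, 0, 2, -1, 1))
DEQUEUE_SLOW = ((1, 0, 1, 1, 1), (-1, 0, 2, -1, 1))  # resets w (back_len)

def step(state, transformer):
    r, a = transformer
    return tuple(ri * si + ai for ri, si, ai in zip(r, state, a))

seed(0)
state = (0, 0, 0, 0, 0)  # empty queue: all tracked quantities start at 0
for _ in range(1000):
    state = step(state, choice([ENQUEUE, DEQUEUE_FAST, DEQUEUE_SLOW]))
    v, w, x, y, z = state
    # x is never reset and grows by at most 4 per step, while z grows by
    # exactly 1, so mem_ops + 3*back_len <= 4*nb_ops throughout.
    assert x <= 4 * z
```

Since back_len is nonnegative, this yields mem_ops ≤ 4·nb_ops, i.e., the O(1) amortized memory-operation bound claimed for the queue.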
2 Background
A transition system is a pair \((S, \rightarrow )\) where S is a (potentially infinite) set of states and \(\rightarrow \subseteq S \times S\) is a transition relation. For a transition relation \(\rightarrow \), we use \(\rightarrow ^*\) to denote its reflexive, transitive closure.
A transition formula is a formula \(F(\mathbf {x},\mathbf {x}')\) whose free variables range over \(\mathbf {x} = x_1,...,x_n\) and \(\mathbf {x}' = x_1',...,x_n'\) (we refer to the number n as the dimension of F); these variables designate the state before and after a transition. In the following, we assume that transition formulas are defined over \(\exists \text {LIRA}{}\). For a transition formula \(F(\mathbf {x},\mathbf {x}')\) and vectors of terms \(\mathbf {s}\) and \(\mathbf {t}\), we use \(F(\mathbf {s},\mathbf {t})\) to denote the formula F with each \(x_i\) replaced by \(s_i\) and each \(x_i'\) replaced by \(t_i\). A transition formula \(F(\mathbf {x},\mathbf {x}')\) defines a transition system \((S_{F}, \rightarrow _{F})\), where the state space \(S_{F}\) is \(\mathbb {Q}^{n}\) and which can transition \(\mathbf {u} \rightarrow _{F} \mathbf {v}\) iff \(F(\mathbf {u},\mathbf {v})\) is valid.
For two rational vectors \(\mathbf {a}\) and \(\mathbf {b}\) of the same dimension d, we use \(\mathbf {a} \cdot \mathbf {b}\) to denote the inner product \(\mathbf {a} \cdot \mathbf {b} = \sum _{i=1}^d a_ib_i\) and \(\mathbf {a} * \mathbf {b}\) to denote the pointwise (aka Hadamard) product \((\mathbf {a} * \mathbf {b})_i = a_i b_i\). For any natural number i, we use \(\mathbf {e}_i\) to denote the standard basis vector in the ith direction (i.e., the vector consisting of all zeros except the ith entry, which is 1), where the dimension of \(\mathbf {e}_i\) is understood from context. We use \(I_n\) to denote the \(n \times n\) identity matrix.
Definition 1
A rational vector addition system with resets (\(\mathbb {Q}\) VASR) of dimension d is a finite set \(V{} \subseteq \{0,1\}^d \times \mathbb {Q}^d\) of transformers. Each transformer \((\mathbf {r},\mathbf {a}) \in V{}\) consists of a binary reset vector \(\mathbf {r}\), and a rational addition vector \(\mathbf {a}\), both of dimension d. \(V{}\) defines a transition system \((S_{V{}}, \rightarrow _{V{}})\), where the state space \(S_{V{}}\) is \(\mathbb {Q}^d\) and which can transition \(\mathbf {u} \rightarrow _{V{}} \mathbf {v}\) iff \(\mathbf {v} = \mathbf {r}* \mathbf {u} + \mathbf {a}\) for some \((\mathbf {r},\mathbf {a}) \in V{}\).
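The transition relation of Definition 1 is straightforward to execute. The following sketch (our own illustration, using exact rational arithmetic) applies two transformers of a small 2-dimensional \(\mathbb{Q}\)VASR:

```python
from fractions import Fraction as Q

def vasr_step(state, transformer):
    """One transition u -> v of a Q-VASR, with v = r * u + a (pointwise)."""
    r, a = transformer
    return tuple(Q(ri) * ui + Q(ai) for ri, ui, ai in zip(r, state, a))

# A 2-dimensional Q-VASR with two transformers: the first adds (1/2, 1),
# the second resets dimension 0 to 3 and adds 1 to dimension 1.
V = [((1, 1), (Q(1, 2), Q(1))),
     ((0, 1), (Q(3), Q(1)))]

u = (Q(0), Q(0))
u = vasr_step(u, V[0])   # -> (1/2, 1)
u = vasr_step(u, V[1])   # -> (3, 2): dimension 0 reset to 3, then +3 added
assert u == (Q(3), Q(2))
```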
Definition 2
A rational vector addition system with resets and states (\(\mathbb {Q}\)VASRS) of dimension d is a pair \(\mathcal {V}= (Q, E)\), where Q is a finite set of control states, and \(E \subseteq Q \times \{0,1\}^d \times \mathbb {Q}^d \times Q\) is a finite set of edges labeled by (d-dimensional) transformers. \(\mathcal {V}\) defines a transition system \((S_{\mathcal {V}}, \rightarrow _{\mathcal {V}})\), where the state space \(S_{\mathcal {V}}\) is \(Q \times \mathbb {Q}^d\) and which can transition \((q_1, \mathbf {u}) \rightarrow _{\mathcal {V}} (q_2, \mathbf {v})\) iff there is some edge \((q_1, (\mathbf {r},\mathbf {a}), q_2) \in E\) such that \(\mathbf {v} = \mathbf {r}* \mathbf {u} + \mathbf {a}\).
Our invariant generation scheme is based on the following result, which is a simple consequence of the work of Haase and Halfon:
Theorem 1
([8]). There is a polytime algorithm which, given a ddimensional \(\mathbb {Q}\)VASRS \(\mathcal {V}{} = (Q, E)\), computes an \(\exists \text {LIRA}\) transition formula \(\textit{reach}(\mathcal {V}{})\) such that for all \(\mathbf {u},\mathbf {v} \in \mathbb {Q}^d\), we have \((p, \mathbf {u}) \rightarrow _{\mathcal {V}{}}^* (q, \mathbf {v})\) for some control states \(p, q \in Q\) if and only if \(\mathbf {u} \rightarrow _{\textit{reach}(\mathcal {V}{})} \mathbf {v}\).
Note that \(\mathbb {Q}\)VASR can be realized as \(\mathbb {Q}\)VASRS with a single control state, so this theorem also applies to \(\mathbb {Q}\)VASR.
3 Approximating Loops with Vector Addition Systems
In this section, we describe a method for overapproximating the transitive closure of a transition formula using a \(\mathbb {Q}\)VASR. This procedure immediately extends to computing summaries for programs (including programs with nested loops) using the method outlined in Sect. 1.1.
The core algorithmic problem that we answer in this section is: given a transition formula, how can we synthesize a (best) abstraction of that formula’s dynamics as a \(\mathbb {Q}\)VASR? We begin by formalizing the problem: in particular, we define what it means for a \(\mathbb {Q}\)VASR to simulate a transition formula and what it means for an abstraction to be “best.”
Definition 3
Let \(A = (\mathbb {Q}^n,\rightarrow _A)\) and \(B = (\mathbb {Q}^m,\rightarrow _B)\) be transition systems operating over rational vector spaces. A linear simulation from A to B is a linear transformation \(S \in \mathbb {Q}^{m \times n}\) such that for all \(\mathbf {u},\mathbf {v} \in \mathbb {Q}^n\) for which \(\mathbf {u} \rightarrow _A \mathbf {v}\), we have \(S\mathbf {u} \rightarrow _B S\mathbf {v}\). We use \(A \Vdash _S B\) to denote that S is a linear simulation from A to B.
Our task is to synthesize a linear transformation S and a \(\mathbb {Q}\)VASR \(V{}\) such that \({F \Vdash _S V{}}\). We call a pair \((S,V{})\), consisting of a rational matrix \(S \in \mathbb {Q}^{d \times n}\) and a ddimensional \(\mathbb {Q}\)VASR \(V{}\), a \(\mathbb {Q}\)VASR abstraction. We say that n is the concrete dimension of \((S,V{})\) and d is the abstract dimension. If \(F \Vdash _S V{}\), then we say that \((S,V{})\) is a \(\mathbb {Q}\)VASR abstraction of F. A transition formula may have many \(\mathbb {Q}\)VASR abstractions; we are interested in computing a \(\mathbb {Q}\)VASR abstraction \((S,V{})\) that results in the most precise overapproximation of the transitive closure of F. Towards this end, we define a preorder \(\preceq \) on \(\mathbb {Q}\)VASR abstractions, where \((S^1,V{}^1) \preceq (S^2,V{}^2)\) iff there exists a linear transformation \(T \in \mathbb {Q}^{e\times d}\) such that \(V{}^1 \Vdash _T V{}^2\) and \(TS^1 = S^2\) (where d and e are the abstract dimensions of \((S^1,V{}^1)\) and \((S^2,V{}^2)\), respectively). Observe that if \((S^1,V{}^1) \preceq (S^2,V{}^2)\), then \(\textit{reach}(V{}^1)(S^1\mathbf {x},S^1\mathbf {x}') \,\models \, \textit{reach}(V{}^2)(S^2\mathbf {x},S^2\mathbf {x}')\).
Algorithm 1 follows the familiar pattern of an All-SAT-style loop. The algorithm takes as input a transition formula F. It maintains a \(\mathbb {Q}\)VASR abstraction (S, V) and a formula \(\varGamma \), whose models correspond to the transitions of F that are not simulated by (S, V). The idea is to build (S, V) iteratively by sampling transitions from \(\varGamma \), augmenting (S, V) to simulate the sample transition, and then updating \(\varGamma \) accordingly. We initialize (S, V) to be \((I_n, \emptyset )\), the canonical least \(\mathbb {Q}\)VASR abstraction in \(\preceq \) order, and \(\varGamma \) to be F (i.e., \((I_n,\emptyset )\) does not simulate any transitions of F). Each loop iteration proceeds as follows. First, we sample a model M of \(\varGamma \) (i.e., a transition that is allowed by F but not simulated by \((S,V{})\)). We then generalize that transition to a set of transitions by using M to select a cube C of the DNF of F that contains M. Next, we use the procedure described in Sect. 3.1 to compute a \(\mathbb {Q}\)VASR abstraction \(\hat{\alpha }(C)\) that simulates the transitions of C. We then update the \(\mathbb {Q}\)VASR abstraction (S, V) to be the least upper bound of (S, V) and \(\hat{\alpha }(C)\) (w.r.t. \(\preceq \) order) using the procedure described in Sect. 3.2 (line 7). Finally, we block any transition simulated by the least upper bound (including every transition in C) from being sampled again by conjoining \(\lnot \gamma (S,V)\) to \(\varGamma \). The loop terminates when \(\varGamma \) is unsatisfiable, in which case we have that \(F \Vdash _S V{}\). Theorem 2 gives the correctness statement for this algorithm.
Theorem 2
Given a transition formula F, Algorithm 1 computes a simulation S and \(\mathbb {Q}\)VASR V such that \(F \Vdash _S V\). Moreover, if F is in \(\exists \text {LRA}{}\), Algorithm 1 computes a best \(\mathbb {Q}\)VASR abstraction of F.
The proof of this theorem as well as the proofs to all subsequent theorems, lemmas, and propositions are in the extended version of this paper [20].
3.1 Abstracting Conjunctive Transition Formulas
This section shows how to compute a \(\mathbb {Q}\)VASR abstraction for a consistent conjunctive formula. When the input formula is in \(\exists \text {LRA}\), the computed \(\mathbb {Q}\)VASR abstraction will be a best \(\mathbb {Q}\)VASR abstraction of the input formula. The intuition is that, since \(\exists \text {LRA}{}\) is a convex theory, a best \(\mathbb {Q}\)VASR abstraction consists of a single transition. For \(\exists \text {LIRA}{}\) formulas, our procedure produces a \(\mathbb {Q}\)VASR abstraction that is not guaranteed to be best, precisely because \(\exists \text {LIRA}{}\) is not convex.
Example 1
Proposition 1
For any consistent, conjunctive transition formula C, \(\hat{\alpha }(C)\) is a \(\mathbb {Q}\)VASR abstraction of C. If C is expressed in \(\exists \text {LRA}\), then \(\hat{\alpha }(C)\) is best.
3.2 Computing Least Upper Bounds
This section shows how to compute least upper bounds w.r.t. the \(\preceq \) order.
By definition of the \(\preceq \) order, if \((S,V{})\) is an upper bound of \((S^1,V{}^1)\) and \((S^2,V{}^2)\), then there must exist matrices \(T^1\) and \(T^2\) such that \(T^1S^1 = S = T^2S^2\), \(V^1 \Vdash _{T^1} V\), and \(V^2 \Vdash _{T^2} V\). As we shall see, if (S, V) is a least upper bound, then it is completely determined by the matrices \(T^1\) and \(T^2\). Thus, we shift our attention to computing simulation matrices \(T^1\) and \(T^2\) that induce a least upper bound.
In view of the desired equation \(T^1S^1 = S = T^2S^2\), let us consider the constraint \(T^1S^1 = T^2S^2\) on two unknown matrices \(T^1\) and \(T^2\). Clearly, we have \(T^1S^1 = T^2S^2\) iff each \((T^1_i,T^2_i)\) belongs to the set \(\mathcal {T} \triangleq \{ (\mathbf {t}^1, \mathbf {t}^2) : \mathbf {t}^1S^1 = \mathbf {t}^2S^2 \}\). Observe that \(\mathcal {T}\) is a vector space, so there is a best solution to the constraint \(T^1S^1 = T^2S^2\): choose \(T^1\) and \(T^2\) so that the set of all row pairs \((T^1_i,T^2_i)\) forms a basis for \(\mathcal {T}\). In the following, we use \(\textit{pushout}(S^1,S^2)\) to denote a function that computes such a best \((T^1, T^2)\).
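As a concrete (and hedged) illustration, \(\textit{pushout}\) can be computed as a basis of the left null space of the matrix obtained by stacking \(S^1\) on top of \(-S^2\). The sketch below is our own code, using a textbook Gaussian-elimination null-space computation over exact rationals in place of whatever linear-algebra routine an actual implementation would use:

```python
from fractions import Fraction as Q

def rref(rows):
    """Reduced row echelon form over the rationals; returns (rows, pivot columns)."""
    rows = [list(r) for r in rows]
    pivots, lead = [], 0
    for r in range(len(rows)):
        while lead < len(rows[0]):
            piv = next((i for i in range(r, len(rows)) if rows[i][lead] != 0), None)
            if piv is None:
                lead += 1
                continue
            rows[r], rows[piv] = rows[piv], rows[r]
            rows[r] = [x / rows[r][lead] for x in rows[r]]
            for i in range(len(rows)):
                if i != r and rows[i][lead] != 0:
                    f = rows[i][lead]
                    rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
            pivots.append(lead)
            lead += 1
            break
    return rows, pivots

def pushout(S1, S2):
    """Basis of {(t1, t2) : t1*S1 = t2*S2}: the left null space of the
    matrix A obtained by stacking S1 on top of -S2."""
    d1, d2 = len(S1), len(S2)
    n = len(S1[0])
    A = [list(map(Q, row)) for row in S1] + [[-Q(x) for x in row] for row in S2]
    # Left null space of A = (right) null space of A^T.
    At = [[A[j][i] for j in range(d1 + d2)] for i in range(n)]
    R, pivots = rref(At)
    basis = []
    for free in range(d1 + d2):
        if free in pivots:
            continue
        v = [Q(0)] * (d1 + d2)
        v[free] = Q(1)
        for row, p in zip(R, pivots):
            v[p] = -row[free]
        basis.append(v)
    # Split each basis vector [t1 | t2] into a row of T1 and a row of T2.
    return [b[:d1] for b in basis], [b[d1:] for b in basis]

S1 = [[1, 0], [0, 1]]   # identity simulation on two variables
S2 = [[1, 1]]           # tracks only the sum of the two variables
T1, T2 = pushout(S1, S2)
# Every basis pair (t1, t2) satisfies t1*S1 = t2*S2.
for t1, t2 in zip(T1, T2):
    lhs = [sum(a * Q(b) for a, b in zip(t1, col)) for col in zip(*S1)]
    rhs = [sum(a * Q(b) for a, b in zip(t2, col)) for col in zip(*S2)]
    assert lhs == rhs
```

In this example the common refinement is the single functional `x + y`, so `pushout` returns `T1 = [[1, 1]]` and `T2 = [[1]]`.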
While \(\textit{pushout}\) gives a best solution to the equation \(T^1S^1 = T^2S^2\), it is not sufficient for the purpose of computing least upper bounds for \(\mathbb {Q}\)VASR abstractions, because \(T^1\) and \(T^2\) may not respect the structure of the \(\mathbb {Q}\)VASRs \(V^1\) and \(V^2\) (i.e., there may be no \(\mathbb {Q}\)VASR V such that \(V^1 \Vdash _{T^1} V\) and \(V^2 \Vdash _{T^2} V\)). Thus, we must further constrain our problem by requiring that \(T^1\) and \(T^2\) are coherent with respect to \(V^1\) and \(V^2\) (respectively).
Definition 4
Let \(V\) be a ddimensional \(\mathbb {Q}\)VASR. We say that \(i,j \in \{1,...,d\}\) are coherent dimensions of \(V\) if for all transitions \((\mathbf {r},\mathbf {a}) \in V{}\) we have \(r_i = r_j\) (i.e., every transition of \(V{}\) that resets i also resets j and vice versa). We denote that i and j are coherent dimensions of \(V{}\) by writing \(i \equiv _{V{}} j\), and observe that \(\equiv _{V{}}\) forms an equivalence relation on \(\{1,...,d\}\). We refer to the equivalence classes of \(\equiv _{V{}}\) as the coherence classes of \(V{}\).
A matrix \(T \in \mathbb {Q}^{e \times d}\) is coherent with respect to \(V{}\) if and only if each of its rows has nonzero values only in the dimensions corresponding to a single coherence class of \(V{}\).
For any ddimensional \(\mathbb {Q}\)VASR \(V{}\) and coherence class \(C = \{c_1,...,c_k\}\) of \(V{}\), define \(\varPi _{C}\) to be the \(k \times d\) dimensional matrix whose rows are \(\mathbf {e}_{c_1}, ..., \mathbf {e}_{c_k}\). Intuitively, \(\varPi _{C}\) is a projection onto the set of dimensions in C.
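Coherence classes can be computed by grouping dimensions according to their reset pattern across all transformers. The following sketch (our own helper, with illustrative names) does exactly that:

```python
def coherence_classes(V):
    """Partition the dimensions {0,...,d-1} of a Q-VASR into coherence
    classes: i and j are coherent iff every transformer resets both or
    neither, i.e. they have the same reset bit in every transformer."""
    d = len(V[0][0])
    # Signature of dimension i: the tuple of its reset bits across V.
    signature = {i: tuple(r[i] for r, _ in V) for i in range(d)}
    classes = {}
    for i in range(d):
        classes.setdefault(signature[i], []).append(i)
    return sorted(classes.values())

# Two transformers over 3 dimensions (reset vector, addition vector):
# dimension 0 is never reset; dimensions 1 and 2 are always reset together.
V = [((1, 0, 0), (1, 5, 7)),
     ((1, 0, 0), (2, 0, 1))]
assert coherence_classes(V) == [[0], [1, 2]]
```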
Coherence is a necessary and sufficient condition for linear simulations between \(\mathbb {Q}\)VASR in a sense described in Lemmas 1 and 2.
Lemma 1
Let \(V{}^1\) and \(V{}^2\) be \(\mathbb {Q}\)VASR (of dimension d and e, respectively), and let \(T \in \mathbb {Q}^{e \times d}\) be a matrix such that \(V{}^1 \Vdash _T V{}^2\). Then T must be coherent with respect to \(V{}^1\).
Lemma 2
Let \(V{}\) be a ddimensional \(\mathbb {Q}\)VASR and let \(T \in \mathbb {Q}^{e \times d}\) be a matrix that is coherent with respect to \(V{}\) and has no zero rows. Then the transition relation of \(\textit{image}(V{},T)\) is the image of \(V{}\)’s transition relation under T (i.e., \(\rightarrow _{\textit{image}(V{},T)}\) is equal to \(\{(T\mathbf {u},T\mathbf {v}) : \mathbf {u} \rightarrow _{V} \mathbf {v}\}\)).
Finally, prior to describing our least upper bound algorithm, we must define a technical condition that is both assumed and preserved by the procedure:
Definition 5
A \(\mathbb {Q}\)VASR abstraction \((S,V{})\) is normal if there is no nonzero vector \(\mathbf {z}\) that is coherent with respect to \(V{}\) such that \(\mathbf {z}S = 0\) (i.e., the rows of S that correspond to any coherence class of \(V{}\) are linearly independent).
Proposition 2
Let \((S^1,V{}^1)\) and \((S^2,V{}^2)\) be normal \(\mathbb {Q}\)VASR abstractions of equal concrete dimension. Then the \(\mathbb {Q}\)VASR abstraction \((S,V{})\) computed by Algorithm 2 is normal and is a least upper bound of \((S^1,V{}^1)\) and \((S^2,V{}^2)\).
4 Control Flow and \(\mathbb {Q}\)VASRS
We begin with an example that demonstrates the precision gained by \(\mathbb {Q}\)VASRS. The loop in Fig. 2a oscillates between (1) incrementing variable i by 1 and (2) incrementing both variables i and x by 1. Suppose that we wish to prove that, starting with the configuration \(x = 0 \wedge i = 1\), the loop maintains the invariant that \(2x \le i\). The (best) \(\mathbb {Q}\)VASR abstraction of the loop, pictured in Fig. 2b, overapproximates the control flow of the loop by treating the conditional branch in the loop as a nondeterministic branch. This overapproximation may violate the invariant \(2x \le i\) by repeatedly executing the path where both variables are incremented. On the other hand, the \(\mathbb {Q}\)VASRS abstraction of the loop pictured in Fig. 2c captures the understanding that the loop must oscillate between the two paths. The loop summary obtained from the reachability relation of this \(\mathbb {Q}\)VASRS is powerful enough to prove the invariant \(2x \le i\) holds (under the precondition \(x = 0 \wedge i = 1\)).
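The oscillation argument can be checked directly by simulation. The sketch below is our own reading of Fig. 2a, under the assumption (consistent with the discussion in Sect. 4.1) that the branch taken depends on the parity of i:

```python
# Direct simulation of the oscillating loop of Fig. 2a: odd iterations
# increment both i and x; even iterations increment i only. The parity
# condition is our assumption about the (unshown) branch in the figure.
def run(i, x, iterations):
    trace = []
    for _ in range(iterations):
        if i % 2 == 1:
            i, x = i + 1, x + 1   # path 2: increment both variables
        else:
            i = i + 1             # path 1: increment i only
        trace.append((i, x))
    return trace

# Under the precondition x = 0 and i = 1, the invariant 2x <= i holds
# at every step, because x can only grow on every other iteration.
for i, x in run(1, 0, 100):
    assert 2 * x <= i
```

Treating the branch as nondeterministic instead (the \(\mathbb{Q}\)VASR abstraction of Fig. 2b) would allow path 2 to repeat, violating the invariant; this is exactly the precision the \(\mathbb{Q}\)VASRS of Fig. 2c recovers.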
4.1 Technical Details
In the following, we give a method for overapproximating the transitive closure of a transition formula \(F(\mathbf {x},\mathbf {x}')\) using a \(\mathbb {Q}\)VASRS. We start by defining predicate \(\mathbb {Q}\)VASRS, a variation of \(\mathbb {Q}\)VASRS with control states that correspond to disjoint state predicates (where the states intuitively belong to the transition formula F rather than the \(\mathbb {Q}\)VASRS itself). We extend linear simulations and best abstractions to predicate \(\mathbb {Q}\)VASRS, and give an algorithm for synthesizing best predicate \(\mathbb {Q}\)VASRS abstractions (for a given set of predicates). Finally, we give an end-to-end algorithm for overapproximating the transitive closure of a transition formula.
Definition 6
A predicate \(\mathbb {Q}\) VASRS over \(\mathbf {x}\) is a \(\mathbb {Q}\)VASRS \(\mathcal {V}{} = (P, E)\), such that each control state is a predicate over the variables \(\mathbf {x}\) and the predicates in P are pairwise inconsistent (for all \(p \ne q \in P\), \(p \wedge q\) is unsatisfiable).

Let \(F(\mathbf {x},\mathbf {x}')\) be an n-dimensional transition formula and let \(\mathcal {V}{} = (P, E)\) be an m-dimensional \(\mathbb {Q}\)VASRS over \(\mathbf {x}\). We say that a linear transformation \(S \in \mathbb {Q}^{m \times n}\) is a linear simulation from F to \(\mathcal {V}{}\) if for all \(\mathbf {u},\mathbf {v} \in \mathbb {Q}^n\) such that \(\mathbf {u} \rightarrow _F \mathbf {v}\), (1) there is a (unique) \(p \in P\) such that \(p(\mathbf {u})\) is valid, (2) there is a (unique) \(q \in P\) such that \(q(\mathbf {v})\) is valid, and (3) \((p, S\mathbf {u}) \rightarrow _{\mathcal {V}{}} (q, S\mathbf {v})\).

Let \(\mathcal {V}{}^1 = (P^1, E^1)\) and \(\mathcal {V}{}^2 = (P^2, E^2)\) be predicate \(\mathbb {Q}\)VASRSs over \(\mathbf {x}\) (for some \(\mathbf {x}\)) of dimensions d and e, respectively. We say that a linear transformation \(S \in \mathbb {Q}^{e \times d}\) is a linear simulation from \(\mathcal {V}{}^1\) to \(\mathcal {V}{}^2\) if for all \(p^1,q^1 \in P^1\) and for all \(\mathbf {u},\mathbf {v} \in \mathbb {Q}^d\) such that \((p^1, \mathbf {u}) \rightarrow _{\mathcal {V}{}^1} (q^1, \mathbf {v})\), there exist (unique) \(p^2, q^2 \in P^2\) such that (1) \((p^2, S\mathbf {u}) \rightarrow _{\mathcal {V}{}^2} (q^2,S\mathbf {v})\), (2) \(p^1 \,\models \,p^2\), and (3) \(q^1 \,\models \, q^2\).
Algorithm 3 works as follows: first, for each pair of formulas \(p, q \in P\), compute a best \(\mathbb {Q}\)VASR abstraction of the formula \(p(\mathbf {x}) \wedge F(\mathbf {x},\mathbf {x}') \wedge q(\mathbf {x}')\) and call it \((S_{p,q},V{}_{p,q})\). \((S_{p,q},V{}_{p,q})\) overapproximates the transitions of F that begin in a program state satisfying p and end in a program state satisfying q. Second, we compute the least upper bound of all \(\mathbb {Q}\)VASR abstractions \((S_{p,q},V{}_{p,q})\) to get a \(\mathbb {Q}\)VASR abstraction (S, V) for F. As a sideeffect of the least upper bound computation, we obtain a linear simulation \(T_{p,q}\) from \((S_{p,q},V_{p,q})\) to (S, V) for each p, q. A best \(\mathbb {Q}\)VASRS abstraction of \(F(\mathbf {x},\mathbf {x}')\) with control states P has S as its simulation matrix and has the image of \(V_{p,q}\) under \(T_{p,q}\) as the edges from p to q.
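The first step of Algorithm 3 can be illustrated with a small sketch: partition the transitions of a loop by (source predicate, target predicate) pairs to recover the control structure of the predicate \(\mathbb{Q}\)VASRS. The predicates and the sampled loop below are our own illustrative choices, modeled on Fig. 2a with predicates on the parity of i:

```python
# Pairwise-inconsistent control-state predicates over (i, x).
predicates = {"even": lambda i, x: i % 2 == 0,
              "odd":  lambda i, x: i % 2 == 1}

def loop_step(i, x):
    # One iteration of the oscillating loop of Fig. 2a (our reading:
    # increment both variables on odd i, only i on even i).
    return (i + 1, x + 1) if i % 2 == 1 else (i + 1, x)

def classify(state):
    return next(name for name, p in predicates.items() if p(*state))

# Sample transitions and record which predicate pairs (p, q) they connect;
# in the full algorithm, each pair gets its own Q-VASR abstraction of
# p(x) /\ F(x, x') /\ q(x').
edges = set()
state = (1, 0)
for _ in range(20):
    nxt = loop_step(*state)
    edges.add((classify(state), classify(nxt)))
    state = nxt

# Only the two "oscillating" edges appear; the self-loops (even -> even,
# odd -> odd) are absent, which is the control structure of Fig. 2c.
assert edges == {("odd", "even"), ("even", "odd")}
```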
Proposition 3
Given a transition formula \(F(\mathbf {x},\mathbf {x}')\) and control states P over \(\mathbf {x}\), Algorithm 3 computes the best predicate \(\mathbb {Q}\)VASRS abstraction of F with control states P.
Precision Improvement. The abstract-VASRS algorithm uses predicates to infer the control structure of a \(\mathbb {Q}\)VASRS, but after computing the \(\mathbb {Q}\)VASRS abstraction, iter-VASRS makes no further use of the predicates (i.e., the predicates are irrelevant in the computation of \(\textit{reach}(\mathcal {V})\)). Predicates can be used to improve iter-VASRS as follows: the reachability relation of a \(\mathbb {Q}\)VASRS is expressed by a formula that uses auxiliary variables to represent the states at which the computation begins and ends [8]. These variables can be used to encode that the pre-state of the transitive closure must satisfy the predicate corresponding to the begin state and the post-state must satisfy the predicate corresponding to the end state. As an example, consider Fig. 2 and suppose that we wish to prove the invariant \(x \le 2i\) under the precondition \(i = 0 \wedge x = 0\). While this invariant holds, we cannot prove it without the improvement, because there is a counterexample if the computation begins in a state where \(i\%2 == 1\). By applying the above improvement, we can prove that the computation must begin in a state where \(i\%2 == 0\), and the invariant is verified.
5 Evaluation
Our experiments were designed to answer the following questions:

Are \(\mathbb {Q}\)VASR sufficiently expressive to be able to generate accurate loop summaries?

Does the \(\mathbb {Q}\)VASRS technique improve upon the precision of \(\mathbb {Q}\)VASR?

Are the \(\mathbb {Q}\)VASR/\(\mathbb {Q}\)VASRS loop summarization algorithms performant?
We implemented our loop summarization procedure and the compositional whole-program summarization technique described in Sect. 1.1. We ran our implementation on a suite of 165 benchmarks, drawn from the C4B [2] and HOLA [4] suites, as well as the safe, integer-only benchmarks in the loops category of SV-COMP 2019 [22]. We ran each benchmark with a timeout of 5 min, and recorded how many benchmarks were proved safe by our \(\mathbb {Q}\)VASR-based technique and our \(\mathbb {Q}\)VASRS-based technique. For context, we also compare with CRA [14] (a related loop summarization technique), as well as SeaHorn [7] and UltimateAutomizer [9] (state-of-the-art software model checkers). The results are shown in Fig. 3.
The number of assertions proved correct using \(\mathbb {Q}\)VASR is comparable to both SeaHorn and UltimateAutomizer, demonstrating that \(\mathbb {Q}\)VASR can indeed model interesting loop phenomena. \(\mathbb {Q}\)VASRS-based summarization significantly improves precision, proving the correctness of 93% of assertions in the SV-COMP suite, and more than any other tool in total. Note that the most precise tool for each suite is not strictly better than each of the other tools; in particular, there is only a single program in the HOLA suite that neither \(\mathbb {Q}\)VASRS nor CRA can prove safe.
6 Related Work
Compositional Analysis. Our analysis follows the same high-level structure as compositional recurrence analysis (CRA) [5, 14]. Our analysis differs from CRA in the way that it summarizes loops: we compute loop summaries by overapproximating loops with vector addition systems and computing reachability relations, whereas CRA computes loop summaries by extracting recurrence relations and computing closed forms. The advantage of our approach is that we can use \(\mathbb {Q}\)VASR to accurately model multi-path loops and can make theoretical guarantees about the precision of our analysis; the advantage of CRA is its ability to generate non-linear invariants.
Vector Addition Systems. Our invariant generation method draws upon Haase and Halfon’s polytime procedure for computing the reachability relation of integer vector addition systems with states and resets [8]. Generalization from the integer case to the rational case is straightforward. Continuous Petri nets [3] are a related generalization of vector addition systems, where time is taken to be continuous (\(\mathbb {Q}\)VASR, in contrast, have rational state spaces but discrete time). Reachability for continuous Petri nets is computable in polytime [6] and definable in \(\exists \text {LRA}{}\) [1].
Sinn et al. present a technique for resource bound analysis that is based on modeling programs by lossy vector addition systems with states [21]. Sinn et al. model programs using vector addition systems with states over the natural numbers, which enables them to use termination bounds for VASS to compute upper bounds on resource usage. In contrast, we use VASS with resets over the rationals, which (in contrast to VASS over \(\mathbb {N}\)) have an \(\exists \text {LIRA}{}\)-definable reachability relation, enabling us to summarize loops. Moreover, Sinn et al.’s method for extracting VASS models of programs is heuristic, whereas our method gives precision guarantees.
Affine and Polynomial Programs. The problem of polynomial invariant generation has been investigated for various program models that generalize \(\mathbb {Q}\)VASR, including solvable polynomial loops [19], (extended) P-solvable loops [11, 15], and affine programs [10]. Like ours, these techniques are predictable in the sense that they can make theoretical guarantees about invariant quality. The kinds of invariants that can be produced using these techniques (conjunctions of polynomial equations) are incomparable with those generated by the method presented in this paper (\(\exists \text {LIRA}{}\) formulas).
Symbolic Abstraction. The main contribution of this paper is a technique for synthesizing the best abstraction of a transition formula expressible in the language of \(\mathbb {Q}\)VASR (with or without states). This is closely related to the symbolic abstraction problem, which computes the best abstraction of a formula within an abstract domain. The problem of computing best abstractions has been addressed for finite-height abstract domains [18], template constraint matrices (including intervals and octagons) [16], and polyhedra [5, 24]. Our best abstraction result differs in that (1) it is for a disjunctive domain and (2) the notion of “best” is based on simulation rather than the typical order-theoretic framework.
References
 1. Blondin, M., Finkel, A., Haase, C., Haddad, S.: Approaching the coverability problem continuously. In: Chechik, M., Raskin, J.-F. (eds.) TACAS 2016. LNCS, vol. 9636, pp. 480–496. Springer, Heidelberg (2016). https://doi.org/10.1007/978-3-662-49674-9_28
 2. Carbonneaux, Q., Hoffmann, J., Shao, Z.: Compositional certified resource bounds. In: PLDI (2015)
 3. David, R., Alla, H.: Continuous Petri nets. In: Proceedings of the 8th European Workshop on Applications and Theory of Petri Nets, pp. 275–294 (1987)
 4. Dillig, I., Dillig, T., Li, B., McMillan, K.: Inductive invariant generation via abductive inference. In: OOPSLA (2013)
 5. Farzan, A., Kincaid, Z.: Compositional recurrence analysis. In: FMCAD (2015)
 6. Fraca, E., Haddad, S.: Complexity analysis of continuous Petri nets. Fundam. Inf. 137(1), 1–28 (2015)
 7. Gurfinkel, A., Kahsai, T., Komuravelli, A., Navas, J.A.: The SeaHorn verification framework. In: Kroening, D., Păsăreanu, C.S. (eds.) CAV 2015. LNCS, vol. 9206, pp. 343–361. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-21690-4_20
 8. Haase, C., Halfon, S.: Integer vector addition systems with states. In: Ouaknine, J., Potapov, I., Worrell, J. (eds.) RP 2014. LNCS, vol. 8762, pp. 112–124. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-11439-2_9
 9. Heizmann, M., et al.: Ultimate Automizer and the search for perfect interpolants. In: Beyer, D., Huisman, M. (eds.) TACAS 2018. LNCS, vol. 10806, pp. 447–451. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-89963-3_30
 10. Hrushovski, E., Ouaknine, J., Pouly, A., Worrell, J.: Polynomial invariants for affine programs. In: Logic in Computer Science, pp. 530–539 (2018)
 11. Humenberger, A., Jaroschek, M., Kovács, L.: Invariant generation for multi-path loops with polynomial assignments. In: VMCAI 2018. LNCS, vol. 10747, pp. 226–246. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-73721-8_11
 12. Karp, R.M., Miller, R.E.: Parallel program schemata. J. Comput. Syst. Sci. 3(2), 147–195 (1969)
 13. Kincaid, Z., Breck, J., Forouhi Boroujeni, A., Reps, T.: Compositional recurrence analysis revisited. In: PLDI (2017)
 14. Kincaid, Z., Cyphert, J., Breck, J., Reps, T.: Non-linear reasoning for invariant synthesis. PACMPL 2(POPL), 1–33 (2018)
 15. Kovács, L.: Reasoning algebraically about P-solvable loops. In: Ramakrishnan, C.R., Rehof, J. (eds.) TACAS 2008. LNCS, vol. 4963, pp. 249–264. Springer, Heidelberg (2008). https://doi.org/10.1007/978-3-540-78800-3_18
 16. Li, Y., Albarghouthi, A., Kincaid, Z., Gurfinkel, A., Chechik, M.: Symbolic optimization with SMT solvers. In: POPL, pp. 607–618 (2014)
 17. Monniaux, D.: A quantifier elimination algorithm for linear real arithmetic. In: Cervesato, I., Veith, H., Voronkov, A. (eds.) LPAR 2008. LNCS (LNAI), vol. 5330, pp. 243–257. Springer, Heidelberg (2008). https://doi.org/10.1007/978-3-540-89439-1_18
 18. Reps, T., Sagiv, M., Yorsh, G.: Symbolic implementation of the best transformer. In: Steffen, B., Levi, G. (eds.) VMCAI 2004. LNCS, vol. 2937, pp. 252–266. Springer, Heidelberg (2004). https://doi.org/10.1007/978-3-540-24622-0_21
 19. Rodríguez-Carbonell, E., Kapur, D.: Automatic generation of polynomial loop invariants: algebraic foundations. In: ISSAC, pp. 266–273 (2004)
 20. Silverman, J., Kincaid, Z.: Loop summarization with rational vector addition systems (extended version). arXiv e-prints. arXiv:1905.06495, May 2019
 21. Sinn, M., Zuleger, F., Veith, H.: A simple and scalable static analysis for bound analysis and amortized complexity analysis. In: Biere, A., Bloem, R. (eds.) CAV 2014. LNCS, vol. 8559, pp. 745–761. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-08867-9_50
 22. 8th International Competition on Software Verification (SV-COMP 2019) (2019). https://sv-comp.sosy-lab.org/2019/
 23. Tarjan, R.E.: A unified approach to path problems. J. ACM 28(3), 577–593 (1981)
 24. Thakur, A., Reps, T.: A method for symbolic computation of abstract operations. In: Madhusudan, P., Seshia, S.A. (eds.) CAV 2012. LNCS, vol. 7358, pp. 174–192. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-31424-7_17
Copyright information
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.