Revisiting Enumerative Instantiation
Abstract
Formal methods applications often rely on SMT solvers to automatically discharge proof obligations. SMT solvers handle quantified formulas using incomplete heuristic techniques like E-matching, and often resort to model-based quantifier instantiation (MBQI) when these techniques fail. This paper revisits enumerative instantiation, a technique that considers instantiations based on exhaustive enumeration of ground terms. Although the technique is simple, we argue that it can supplement other instantiation techniques and be a viable alternative to MBQI for valid proof obligations. We first present a stronger Herbrand Theorem, better suited as a basis for the instantiation loop used in SMT solvers; it furthermore requires considering fewer instances than classical Herbrand instantiation. Based on this result, we present different strategies for combining enumerative instantiation with other instantiation techniques in an effective way. The experimental evaluation shows that the implementation of these new techniques in the SMT solver CVC4 leads to significant improvements in several benchmark libraries, including many stemming from verification efforts.
1 Introduction
In many formal methods applications, such as verification, it is common to represent proof obligations in terms of the Satisfiability Modulo Theories (SMT) problem. SMT solvers have thus become popular backends for such applications. They have been primarily designed to be decision procedures for quantifier-free problems, on which they are highly efficient and capable of handling large formulas over background theories. Quantified formulas are generally handled with instantiation techniques that are often incomplete, even on decidable or semi-decidable fragments. Relying heavily on incomplete heuristics, however, leads to instability and unpredictability in the solver’s behavior, which is undesirable for the tools that rely on it. To address these issues some systems use model-based instantiation (MBQI) [19], a complete technique for first-order logic with equality and for several restricted fragments containing theories, which can be used as a fallback strategy to the incomplete techniques.
In this paper we introduce a novel enumerative instantiation technique which can serve as a simpler alternative to model-based instantiation. Similar to MBQI, our technique can be used as a secondary strategy when incomplete techniques fail. Our experiments show that a careful implementation of this technique in the state-of-the-art SMT solver CVC4 leads to noticeable gains in performance on unsatisfiable problems.
Background. Some of the earliest tools for theorem proving in first-order logic come from the work by Skolem and Herbrand. The Herbrand Theorem states that if a closed formula in Skolem normal form, i.e. a prenex formula without existential quantifiers, is unsatisfiable, then there is an unsatisfiable finite conjunction of Herbrand instances of the formula, that is, instances on terms from the Herbrand universe, i.e. the set of all possible well-sorted ground terms in the formula’s signature. The first theorem provers for first-order logic to be implemented based on Herbrand’s theorem employed a completely unguided search on the Herbrand universe (e.g. the early efforts of Gilmore [20] and Davis et al. [11]). Such systems were only capable of dealing with very simple formulas and were soon put aside. Techniques which would only generate Herbrand instances when needed were first introduced by Prawitz [24] and later refined by Davis and Putnam [12], culminating in the resolution calculus introduced by Robinson [30]. The most successful techniques for handling pure first-order logic have been based on resolution and ordering criteria [3]. More recently, techniques based on instantiation have shown promise for first-order logic as well [13, 17, 28]. Inspired by early work on the subject, this paper revisits whether modern implementations of the latter class of techniques can benefit from enumerative instantiation.
Outline. We first give preliminaries in Sect. 2. Then, in Sect. 3, we introduce a stronger Herbrand Theorem as the basis for making enumerative instantiation practical so that it can be used in modern systems. In Sect. 4, we formalize the different instantiation strategies used by state-of-the-art SMT solvers, discuss their strengths and weaknesses, and present a schematization of how to combine such strategies, with a focus on a new strategy for enumerative instantiation. An extensive experimental evaluation of enumerative instantiation as implemented in CVC4 is presented in Sect. 5.
2 Preliminaries
We work in the context of many-sorted first-order logic with equality (see e.g. [16]) and assume the reader is familiar with the notions of signature, term, (quantified and ground) formula, atom, literal, free and bound variable, and substitution.
We consider signatures \(\varSigma \) containing a Bool sort and constants \(\top ,\,\bot \) and a family of predicate symbols \(({\approx } : \tau \times \tau \rightarrow \mathsf {Bool})\) interpreted as equality for each sort \(\tau \). Without loss of generality, we assume \(\approx \) is the only predicate in \(\varSigma \). We use \(=\) for syntactic equality. The set of all terms occurring in a formula \(\varphi \) (resp. term t) is denoted by \(\mathbf {T}( \varphi )\) (resp. \(\mathbf {T}( t )\)). We write \(\bar{t}\) for the sequence of terms \(t_1, \ldots , t_n\) for an unspecified \(n \in \mathbb {N}^+\) that is either irrelevant or deducible from the context.
A substitution \(\sigma \) maps variables to terms and its domain, \(\mathsf {dom}( \sigma )\), is finite. We write \(\mathsf {ran}( \sigma )\) to denote its range. Throughout the paper, conjunctions may be written as sets or tuples, and vice versa, whenever convenient and unambiguous. All definitions are assumed to be lifted in the expected way from formulas into sets or tuples of formulas.
Instantiation-Based SMT Solvers
Quantifiers in formulas are generally handled by SMT solvers through instantiation-based techniques, which capitalize on their capability to handle large ground formulas. In this approach, an input formula \(\psi \) is given to the ground SMT solver, which will abstract all atoms and quantified formulas and treat them as if they were propositional variables. The solver for ground formulas will provide an assignment \(\mathsf {E}\cup \mathsf {Q}\), where \(\mathsf {E}\) is a set of ground literals and \(\mathsf {Q}\) is a set of quantified formulas appearing in \(\psi \), such that \(\mathsf {E}\cup \mathsf {Q}\) propositionally entails \(\psi \). We assume that all quantified formulas in \(\psi \) are of the form \(\forall \bar{x}.\>\varphi \) with \(\varphi \) quantifier-free. This can be achieved by prenex form transformation and Skolemization. The instantiation module of the solver will then generate new ground formulas of the form \(\forall \bar{x}.\>\varphi \Rightarrow \varphi \sigma \) where \(\forall \bar{x}.\>\varphi \) is a quantified formula in \(\mathsf {Q}\) and \(\sigma \) is a substitution from the variables in \(\varphi \) to ground terms. These instances will be added conjunctively to the input of the ground solver, hence refining its knowledge of the quantified formulas. The ground solver may then provide another assignment \(\mathsf {E}'\cup \mathsf {Q}'\), a set that entails both \(\psi \) and the newly added instances. This new assignment might either be the previous one, augmented by new ground literals coming from the new instances, or, if the previous \(\mathsf {E}\) has been refuted by the new instances, a completely different set. On the other hand, the process may terminate if the newly added instances suffice to prove the unsatisfiability of the original formula.
We will refer to the game between the ground solver that provides assignments for the abstraction of the formula and the instantiation module that provides instances added conjunctively to the formula, as the instantiation loop of the SMT solver (see Fig. 1).
3 Herbrand Theorem and Beyond
The Herbrand Theorem (see e.g. [16]) for pure first-order logic with equality^{1} provides a refutationally complete procedure to check the satisfiability of a formula \(\psi \), or more specifically of a set of literals and quantifiers \(\mathsf {E}\cup \mathsf {Q}\). Indeed, \(\mathsf {E}\cup \mathsf {Q}\) is satisfiable if and only if \(\mathsf {E}\cup \mathsf {Q}_g\) is satisfiable, where \(\mathsf {Q}_g\) is the set of all (Herbrand) instances one can build from the quantifiers in \(\mathsf {Q}\) by instantiation with the Herbrand universe, i.e. all the possible well-sorted terms built on the signature used in \(\mathsf {E}\cup \mathsf {Q}\). Based on this, an instantiation module has a simple refutationally complete strategy for pure first-order logic with equality: it suffices to enumerate Herbrand instances. The major drawback of this strategy is that the Herbrand universe is large. For instance, as soon as there is a function with the range sort also used as an argument, the Herbrand universe is infinite.
Fortunately, a stronger variant of the Herbrand Theorem holds. Using this variant, the instantiation module does not need to consider all possible well-sorted terms (i.e. the full Herbrand universe), but only the terms already available in \(\mathsf {E}\cup \mathsf {Q}\), and those subsequently generated.
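To make the size problem concrete, the following is a minimal sketch that enumerates the ground terms of a signature up to a nesting depth. It is only an illustration under simplifying assumptions: the signature is one-sorted, terms are encoded as strings, and the name `herbrand_terms` and its parameters are hypothetical, not part of any solver.

```python
from itertools import product


def herbrand_terms(constants, functions, max_depth):
    """Ground terms of nesting depth <= max_depth over a one-sorted signature.

    constants: iterable of constant names.
    functions: dict mapping a function name to its arity (illustrative encoding).
    Terms are plain strings such as 'f(a)'.
    """
    terms = set(constants)
    for _ in range(max_depth):
        new_terms = set(terms)
        for name, arity in functions.items():
            # Apply each function to every tuple of terms built so far.
            for args in product(sorted(terms), repeat=arity):
                new_terms.add(f"{name}({','.join(args)})")
        terms = new_terms
    return terms


# With constants a, b and one unary function f, every extra level of
# nesting adds new terms, so the enumeration never reaches a fixpoint.
sizes = [len(herbrand_terms({"a", "b"}, {"f": 1}, d)) for d in range(4)]
```

In this small signature the universe already grows at every depth; with functions of higher arity the growth is much faster.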
Theorem 1

for some number n, the finite set of formulas \(\mathsf {E}\cup \bigcup _{i=1}^{n}\mathsf {Q}_i\) is unsatisfiable;
Proof
All proofs for this section are included in [26]. \(\square \)
The above theorem is stronger than the classical Herbrand theorem in the sense that the set of instances considered above is smaller than (or equal to) the set of instances considered in the classical Herbrand theorem. As a trivial example, if a function f appears in \(\mathsf {E}\cup \mathsf {Q}\) only in ground terms, no new applications of f are considered. The theorem does not consider all arbitrary terms from the signature, but only those that are generated by the successive instantiations with only already available ground terms. Note the theorem holds for pure first-order logic with equality, and in any theory that preserves the compactness property. It is also necessary, however, to consider the axioms of the theory for the generation of new terms, which might lead to other instances.
In the Bernays-Schönfinkel-Ramsey fragment of first-order logic (also known as the EPR class) formulas do not contain non-constant function symbols, therefore the Herbrand universe of any formula is a finite set. Since the above sets of terms are subsets of the Herbrand universe, the enumeration will always terminate, even when the formula is satisfiable. Therefore, the resulting ground problem is decidable, and the above method comprises a decision procedure for this fragment, just like some variants of model-based quantifier instantiation.
Theorem 1 implies that an instantiation module only has to consider terms occurring within assignments, and not all possible terms. To show refutational completeness (termination on unsatisfiable input) and model soundness (termination without declaring unsatisfiability implies that the input is satisfiable), it is however necessary to account for the successive assignments produced by the ground SMT solver and the consecutive generation of instances. This is achieved using the following lemma.
Lemma 1

\(\mathsf {E}_0 = \mathsf {E}\), \(\mathsf {E}_{i+1}\,\models \,\mathsf {E}_i \cup \mathsf {Q}_{i}\);
The above lemma has two direct consequences on the instantiation loop of SMT solvers, where instances are generated from the set of available terms in the ground assignment provided by the ground SMT solver. The following two corollaries state the model soundness and the refutational completeness of the instantiation loop respectively.
Corollary 1
Given a formula \(\psi \), if there exists a satisfiable set of literals \(\mathsf {E}\) and a set of quantified clauses \(\mathsf {Q}\) such that \(\mathsf {E}\cup \mathsf {Q}\,\models \,\psi \) and the instantiation module of the SMT solver cannot generate any new instance, i.e. \(\mathsf {E}\) already entails all instances of \(\mathsf {Q}\) for substitutions built with terms in \(\mathbf {T}( \mathsf {E})\), the set of terms occurring in \(\mathsf {E}\), then \(\psi \) is satisfiable.
Proof
A formal statement of the corollary and a proof is available in [26]. \(\square \)
Corollary 2
Given an unsatisfiable formula, if the generation of instances is fair, then the instantiation loop of the SMT solver terminates.
Proof
A formal statement of the corollary and a proof is available in [26]. \(\square \)
4 Quantifier Instantiation in CDCL(\(\mathscr {T}\))
This section overviews recent techniques used by SMT solvers for quantifier instantiation, and comments on their relative strengths and weaknesses. We will focus on enumerative quantifier instantiation, a technique which has received little attention in recent work, but has several compelling advantages with respect to current techniques.
Definition 1
An instantiation strategy takes as input:
1. A \(\mathscr {T}\)-satisfiable set of ground literals \(\mathsf {E}\), and
2. A quantified formula \(\forall \bar{x}.\>\varphi \).
It outputs a set of substitutions \(\{ \sigma _1, \ldots , \sigma _n \}\) where \(\mathsf {dom}( \sigma _i ) = \bar{x}\) for each \(i = 1,\ldots ,n\).
Figure 2 gives four instantiation strategies used by modern SMT solvers, each of which has the interface given in Definition 1. The first three have been described in detail in previous works (see [25] for a recent overview). We briefly review these techniques in this section. The fourth, enumerative quantifier instantiation, is the subject of this paper.
Conflict-based instantiation (\(\mathbf{c}\)) was introduced in [28] as a technique for improving the performance of SMT solvers for unsatisfiable problems. In this strategy, we return a substitution \(\sigma \) such that \(\varphi \sigma \) together with \(\mathsf {E}\) is unsatisfiable. We refer to \(\varphi \sigma \) as a conflicting instance (for \(\mathsf {E}\)). Typical implementations of this strategy do not insist that a conflicting instance be returned if one exists, and hence the strategy may choose to return the empty set of substitutions. Recent work [4, 5] gives a strategy for conflict-based instantiation that has refutational completeness guarantees for the empty theory with equality, that is, when a conflicting instance exists for a quantified formula in this theory, the strategy is guaranteed to return it.
E-matching instantiation (\(\mathbf{e}\)) is the most commonly used strategy for quantifier instantiation in modern SMT solvers [13, 15, 18]. In this strategy, we first heuristically choose a set of triggers for a quantified formula \(\forall \bar{x}.\> \varphi \), where a trigger is a tuple of terms whose free variables are \(\bar{x}\). In practice, triggers can be selected using user-provided annotations, or selected automatically by the SMT solver. For each trigger \(\bar{t}_i\), we select a set of substitutions \(S_i\) such that for each \(\sigma \) in this set, \(\mathsf {E}\) entails that \(\bar{t}_i \sigma \) is equal to a tuple of ground terms \(g_i\) in \(\mathsf {E}\). We return the union of these sets \(S_i\) for each selected trigger. E-matching instantiation is generally incomplete, but works well in practice for unsatisfiable problems, and hence is a key component of most SMT solvers that support quantified formulas.
Model-based quantifier instantiation (\(\mathbf{m}\)) was introduced in [19], and has also been used for improving the performance of finite model finding [29]. In this strategy, we first construct a model \(\mathscr {M}\) for the quantifier-free portion of our input \(\mathsf {E}\), where typically the interpretations of functions for values not constrained by \(\mathsf {E}\) are chosen heuristically. Notice that \(\mathscr {M}\) does not necessarily satisfy the quantified formula \(\forall \bar{x}.\> \varphi \). If it does not, we return a single substitution \(\sigma \) for which \(\mathscr {M}\) does not satisfy \(\varphi \sigma \), where typically \(\sigma \) maps variables from \(\bar{x}\) to terms that occur in \(\mathbf {T}( \mathsf {E})\), the set of terms of \(\mathsf {E}\). With respect to conflict-based and E-matching instantiation, model-based quantifier instantiation has the advantage that it is model sound: when it returns \(\emptyset \), then \(\mathsf {E}\cup \{ \forall \bar{x}.\> \varphi \}\) is satisfiable.
This paper revisits enumerative quantifier instantiation (\(\mathbf{u}\)) as a viable alternative to model-based quantifier instantiation. In this strategy, we assume an ordering \(\preceq \) on quantifier-free terms. This ordering is not related to the usual term ordering one generally uses for saturation theorem proving, but rather determines which instance will be generated first. The strategy returns the substitution \(\{ \bar{x} \mapsto \bar{t} \}\), where \(\bar{t}\) is the minimal tuple of terms with respect to \(\preceq \) from \(\mathbf {T}( \mathsf {E})\), the set of terms of \(\mathsf {E}\), such that \(\varphi \{ \bar{x} \mapsto \bar{t} \}\) is not entailed by \(\mathsf {E}\). We refer to this strategy as enumerative instantiation since in the worst case it generates instantiations by enumerating tuples of all terms of the proper sort from \(\mathsf {E}\), according to the ordering \(\preceq \). In practice, the number of instantiations produced by this strategy is kept small by interleaving it with other strategies like \(\mathbf{c}\) or \(\mathbf{e}\), or due to the fact that a small number of instances may already allow the SMT solver to conclude the input is unsatisfiable. Moreover, thanks to the results in Sect. 3, this strategy is refutationally complete and model sound for quantified formulas in the empty theory with equality.
Example 1
Consider the set of ground literals \(\mathsf {E}= \{ \lnot P( a ), \lnot P( b ), P( c ), \lnot R( b ) \}\). For the input \(( \mathsf {E}, \forall x.\>P( x ) \vee R( x ) )\), the strategies in this section will do the following.
1. Conflict-based: Since \(\mathsf {E},\, P( b ) \vee R( b )\,\models \,\bot \), this strategy will return \(\{ \{ x \mapsto b \} \}\).
2. E-matching: This strategy may choose the singleton set of triggers \(\{ ( P( x ) ) \}\). Based on this trigger, since \(\mathsf {E}\,\models \,P( x ) \{ x \mapsto t \} \approx P( t )\) where \(P( t )\) is a term occurring in \(\mathsf {E}\) for \(t = a, b, c\), this strategy may return \(\{ \{ x \mapsto a \},\, \{ x \mapsto b \},\, \{ x \mapsto c \} \}\).
3. Model-based: This strategy will construct a model \(\mathscr {M}\) for \(\mathsf {E}\), where we assume that \(P^\mathscr {M}= \lambda x.\> \mathsf {ite}( x \approx c,\, \top ,\, \bot )\) and \(R^\mathscr {M}= \lambda x.\> \bot \). Since \(\mathscr {M}\) does not satisfy \(P( a ) \vee R( a )\), this strategy may return \(\{ \{ x \mapsto a \} \}\).
4. Enumerative instantiation: This strategy chooses an ordering on tuples of terms, say the lexicographic extension of \(\preceq \) where \(a \prec b \prec c\). Since \(\mathsf {E}\) does not entail \(P( a ) \vee R( a )\), this strategy returns \(\{ \{ x \mapsto a \} \}\). \(\square \)
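The four behaviors in Example 1 can be replayed with a small sketch. This is not how CVC4 implements the strategies: it assumes that \(\mathsf {E}\) is encoded as a map from ground atoms to truth values, that the quantified formula is the fixed clause P(x) ∨ R(x), and that entailment is a simple lookup (valid here because the example involves no equalities). All function names are hypothetical.

```python
# E from Example 1: {not P(a), not P(b), P(c), not R(b)}
E = {("P", "a"): False, ("P", "b"): False, ("P", "c"): True, ("R", "b"): False}
BODY = ("P", "R")          # the clause P(x) v R(x)
TERMS = ["a", "b", "c"]    # listed in the order a < b < c


def status(E, t):
    """Classify the instance P(t) v R(t) with respect to E."""
    values = [E.get((p, t)) for p in BODY]
    if any(v is True for v in values):
        return "entailed"      # some disjunct is asserted true in E
    if all(v is False for v in values):
        return "conflict"      # every disjunct is asserted false in E
    return "unknown"


def conflict_based(E):
    return [{"x": t} for t in TERMS if status(E, t) == "conflict"][:1]


def e_matching(E, trigger_pred="P"):
    # Match the trigger P(x) against every ground atom P(t) in E.
    return [{"x": t} for (p, t) in E if p == trigger_pred]


def model_based(E):
    # Assumed model: atoms not constrained by E are interpreted as false.
    return [{"x": t} for t in TERMS
            if not any(E.get((p, t), False) for p in BODY)][:1]


def enumerative(E):
    # First term in the ordering whose instance is not entailed by E.
    return [{"x": t} for t in TERMS if status(E, t) != "entailed"][:1]
```

Under these assumptions, `conflict_based(E)` selects b, while `model_based(E)` and `enumerative(E)` both select a, matching the example.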
In the previous example, clearly \(\{ x \mapsto b \}\) is the most useful substitution, since it leads to an instance \(P( b ) \vee R( b )\) which together with \(\mathsf {E}\) is unsatisfiable. The substitution \(\{ x \mapsto c \}\) is definitely not a useful substitution, since the instance it produces is already entailed by \(P( c ) \in \mathsf {E}\). The substitution \(\{ x \mapsto a \}\) is potentially useful since it forces the solver to satisfy \(P( a ) \vee R( a )\). Here, we point out that the effect of enumerative instantiation and model-based instantiation is essentially the same, as both return an instance that is not entailed by \(\mathsf {E}\). However, the substitutions produced by enumerative instantiation often have advantages with respect to model-based instantiation on unsatisfiable problems.
Example 2
A key observation is that useful instantiations can be obscured by guesses made when constructing models \(\mathscr {M}\). Here, since we decided \(R( a )^\mathscr {M}= \bot \), the substitution \(\{ x \mapsto a \}\) was not considered when applying model-based instantiation to the second quantified formula, and since \(S( a )^\mathscr {M}= \bot \), the substitution \(\{ x \mapsto a \}\) was not considered when applying it to the third. In implementations of model-based instantiation, certain values in models are chosen heuristically, leading to this behavior. This is done out of necessity, since determining whether there exists a model that satisfies quantified formulas, even for a fixed context, is a challenging problem.
On the other hand, the range of substitutions considered by enumerative instantiation in the previous example includes all terms that correspond to instances that are not entailed by \(\mathsf {E}\). The substitutions it considers are “minimally diverse”, that is, in the previous example they introduce new predicates on term a only, whereas model-based instantiation introduces new predicates on a, b and c. Reducing the number of new terms introduced by instantiations can have a significant positive impact on performance in practice. Furthermore, enumerative instantiation has the advantage that a term ordering allows fine-grained heuristics better suited for unsatisfiable problems, which we comment on in Sect. 4.1.
Example 3
Consider the sets \(\mathsf {E}= \{ a \not \approx b,\, b \not \approx c,\, a \not \approx c \}\) and \(\mathsf {Q}= \{ \forall x.\>P( x ) \}\). For the input \(( \mathsf {E},\, \forall x.\> P( x ) )\), model-based quantifier instantiation will first construct a model \(\mathscr {M}\) for \(\mathsf {E}\), where we assume that \(P^\mathscr {M}= \lambda x.\> \top \). It is easy to see \(\mathscr {M}\,\models \,\varphi \{ x \mapsto t \}\) for every term t occurring in \(\mathsf {E}\), and hence it returns the empty set of substitutions, indicating that \(\mathsf {E}\cup \mathsf {Q}\) is satisfiable. On the other hand, assume enumerative instantiation chooses the lexicographic extension of a term ordering \(\preceq \) where \(a \prec b \prec c\). Since \(\mathsf {E}\not \models P( a )\) and a is smaller than b and c according to \(\preceq \), \(\mathbf{u}( \mathsf {E},\, P( x ) )\) returns the set containing \(\{ x \mapsto a \}\). Subsequently and for similar reasons, two more iterations of this strategy will be invoked, resulting in the instances P(b) and P(c) before it terminates with the empty set. \(\square \)
In this example, model-based instantiation was able to terminate on the first iteration, since it guessed the correct interpretation for P, whereas enumerative instantiation considered substitutions mapping x to each ground term a, b, c from \(\mathsf {E}\). For this reason, model-based instantiation is typically better suited for satisfiable problems.
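The three rounds of enumerative instantiation in Example 3 can be traced with a very small sketch. It assumes that the ground terms are pairwise distinct (as enforced by \(\mathsf {E}\) in the example), that the quantified formula is \(\forall x.\> P( x )\), and that the ground solver simply adds each returned unit instance to its assignment. The function name is hypothetical.

```python
def enumerative_loop(terms, asserted=()):
    """Trace the instantiation loop for the formula 'forall x. P(x)'.

    terms: ground terms of E, listed in the chosen ordering.
    asserted: terms t for which P(t) is already entailed by E.
    Returns the terms chosen in each round, in order.
    """
    asserted = set(asserted)
    rounds = []
    while True:
        # Terms whose instance P(t) is not yet entailed by the assignment.
        candidates = [t for t in terms if t not in asserted]
        if not candidates:
            return rounds          # empty set of substitutions: loop stops
        t = candidates[0]          # minimal term in the ordering
        rounds.append(t)
        asserted.add(t)            # the ground solver now asserts P(t)


trace = enumerative_loop(["a", "b", "c"])
```

The trace contains one entry per iteration, after which the strategy returns the empty set and the loop ends.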
4.1 Implementing Enumerative Instantiation
We comment on several important details concerning the implementation of enumerative quantifier instantiation in the SMT solver CVC4.
The underlying term ordering is determined dynamically based on the current set of assertions \(\mathsf {E}\). At all times, we maintain a finite list of quantifier-free terms such that we have fixed the ordering \(t_1 \prec \ldots \prec t_n\). Then, if all combinations of instantiations for \(t_1, \ldots , t_n\) are currently entailed by \(\mathsf {E}\), we choose a term t occurring in \(\mathsf {E}\) such that \(\mathsf {E}\not \models t \approx t_i\) for \(i = 1, \ldots , n\) if one exists, and append it to our ordering so that \(t_n \prec t\). The particular choice of t beyond this criterion is arbitrary. An experimental evaluation of more sophisticated term orderings, such as those inspired by first-order automated theorem proving [2], is the subject of future work.
Entailment Checks. For a set of ground equalities and disequalities \(\mathsf {E}\), quantified formula \(\forall \bar{x}.\> \varphi \) and substitution \(\{ \bar{x} \mapsto \bar{t} \}\), CVC4 implements a two-layered method for checking whether the entailment \(\mathsf {E}\,\models \,\varphi \{ \bar{x} \mapsto \bar{t} \}\) holds. First, we maintain a cache of instantiations that have already been returned on previous iterations. Hence if \(\mathsf {E}\) satisfies a set of formulas containing \(\varphi \{ \bar{x} \mapsto \bar{s} \}\), where \(\mathsf {E}\,\models \,\bar{t} \approx \bar{s}\), then the entailment clearly holds.
1. Replace each constant t in \(\ell \) with \([ t ]\).
2. Replace each function term \(f( t_1, \ldots , t_n )\) in \(\ell \) with s if \(( t_1, \ldots , t_n ) \rightarrow s \in \mathscr {I}_{f}\).
3. If \(\ell \) is \(t \approx t\), replace it by \(\top \).
4. If \(\ell \) is \(t \not \approx s\) and \(t' \not \approx s' \in \mathsf {E}\) where \([ t' ] = t\) and \([ s' ] = s\), replace it by \(\top \).
If the resultant \(\psi \) is \(\top \), then the entailment holds. Although not shown here, the above process extends in a straightforward way to handle Boolean structure, and to other background theories by incorporating theory-specific rewriting steps.
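The four rewriting steps can be sketched for a single literal as follows. This is an illustration under stated assumptions rather than CVC4's data structures: constants are strings, function applications are tuples `(f, arg1, ...)`, `rep` maps each constant to the representative \([ t ]\) of its equivalence class, `interp` plays the role of \(\mathscr {I}_{f}\), and `diseqs` stores the disequalities of \(\mathsf {E}\) as unordered pairs of representatives. All of these names are hypothetical.

```python
def normalize(term, rep, interp):
    """Steps 1-2: rewrite a ground term bottom-up to a representative if possible."""
    if isinstance(term, str):
        return rep.get(term, term)                    # step 1: t -> [t]
    f, *args = term
    nargs = tuple(normalize(a, rep, interp) for a in args)
    # Step 2: use the entry (t1, ..., tn) -> s of I_f when there is one.
    return interp.get(f, {}).get(nargs, (f, *nargs))


def literal_entailed(lit, rep, interp, diseqs):
    """lit = (positive, lhs, rhs). True means the check establishes entailment;
    False only means this incomplete check could not establish it."""
    positive, lhs, rhs = lit
    l, r = normalize(lhs, rep, interp), normalize(rhs, rep, interp)
    if positive:
        return l == r                                 # step 3: t = t becomes true
    return frozenset((l, r)) in diseqs                # step 4: known disequality


# Illustrative E = { a = b, f(a) = c, c != d }
rep = {"a": "a", "b": "a", "c": "c", "d": "d"}
interp = {"f": {("a",): "c"}}
diseqs = {frozenset(("c", "d"))}
```

For instance, with this encoding the literal f(b) ≈ c normalizes to c ≈ c and is reported as entailed, whereas f(d) ≈ c is left undecided because no entry for f(d) exists.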
Restricting Enumeration Space. Enumerative instantiation can be refined further by noticing that only a subset of the set of terms \(\mathbf {T}( \mathsf {E})\) occurring in \(\mathsf {E}\) will ever be relevant for showing unsatisfiability of a quantified formula. An approach in this spirit was used by Ge and de Moura [19], where decidable fragments were identified by noticing that the relevant domains of quantified formulas in these fragments are guaranteed to be finite. In that work, the relevant domain of a quantified formula \(\forall \bar{x}.\>\psi \) is computed based on the terms in \(\mathsf {E}\) and the structure of its body \(\psi \). For example, t is in the relevant domain of function f for all ground terms f(t), the relevant domain of x for a quantified formula containing the term f(x) is equal to the relevant domain of f, and so on. A related approach is to use sort inference [8, 9, 22] to compute more precise sort information and thus decrease the number of possible instantiations.
Example 4
Say \(\mathsf {E}\cup \mathsf {Q}= \{ a \not \approx b, f( a ) \approx c \} \cup \{ \forall x.\>P( f( x ) ) \}\), where a, b, c, x are of sort \(\tau \), f is a unary function \(\tau \rightarrow \tau \), and P is a predicate on \(\tau \). It can be shown that \(\mathsf {E}\cup \mathsf {Q}\) is equivalent to \(\mathsf {E}^s \cup \mathsf {Q}^s =\) \(\{ a_1 \not \approx b_1, f_{12}( a_1 ) \approx c_2 \} \cup \{ \forall x_1.\>P_2( f_{12}( x_1 ) ) \}\), where \(a_1, b_1\), \(x_1\) are of sort \(\tau _1\), \(c_2\) is of sort \(\tau _2\), \(f_{12}\) is of sort \(\tau _1 \rightarrow \tau _2\), and \(P_2\) is a predicate on \(\tau _2\). \(\square \)
Sorts can be inferred in this manner using a linear traversal on the input formula (for details, see for instance Sect. 4 of [22]). This technique narrows the set of terms considered by enumerative instantiation. In the above example, whereas enumerative instantiation for \(\mathsf {E}\cup \mathsf {Q}\) might consider the substitutions \(\{ x \mapsto c \}\) or \(\{ x \mapsto f( c ) \}\), for \(\mathsf {E}^s \cup \mathsf {Q}^s\) it would not consider \(\{ x_1 \mapsto c_2 \}\) since their sorts are different, nor would it consider \(\{ x_1 \mapsto f_{12}( c_2 ) \}\) since \(f_{12}( c_2 )\) is not a well-sorted term. Moreover, the Herbrand universe of an inferred subsort may be finite when the universe of its parent sort is infinite. In the above example, the Herbrand universe of \(\tau _1\) is \(\{ a_1,b_1 \}\) and that of \(\tau _2\) is \(\{ f_{12}( a_1 ), f_{12}( b_1 ), c_2 \}\), whereas the Herbrand universe of \(\tau \) is infinite.
Compound Strategies. Since the instantiation strategies from this section have their respective strengths and weaknesses, it is valuable to combine them. We consider two ways of combining strategies which we refer to as priority instantiation and interleaved instantiation. For base strategies \(\mathbf{s_1}\) and \(\mathbf{s_2}\), priority instantiation (\(\mathbf{s_1};\mathbf{s_2}\)) first invokes \(\mathbf{s_1}\). If this strategy returns a nonempty set of substitutions, it returns that set, otherwise it returns the substitutions returned by \(\mathbf{s_2}\). On the other hand, interleaved instantiation (\(\mathbf {s_1}\)+\(\mathbf {s_2}\)) returns the union of the substitutions returned by the two strategies.
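One way to picture such a traversal is a union-find over "slots", where slot 0 of a symbol is its result and slot i is its i-th argument. The sketch below is an assumption-laden illustration, not the algorithm of [22] or of CVC4: terms are nested tuples, equalities and disequalities both merge the result slots of their two sides, and the names `UnionFind`, `visit` and `infer_sorts` are hypothetical.

```python
class UnionFind:
    def __init__(self):
        self.parent = {}

    def find(self, x):
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, x, y):
        self.parent[self.find(x)] = self.find(y)


def visit(term, uf):
    """Return the result slot of term, merging each argument slot of its
    head symbol with the result slot of the corresponding argument."""
    if isinstance(term, str):
        return (term, 0)
    head, *args = term
    for i, arg in enumerate(args, start=1):
        uf.union((head, i), visit(arg, uf))
    return (head, 0)


def infer_sorts(equations, atoms):
    """equations: pairs (lhs, rhs) from equalities and disequalities;
    atoms: predicate applications such as ('P', ('f', 'x'))."""
    uf = UnionFind()
    for lhs, rhs in equations:
        uf.union(visit(lhs, uf), visit(rhs, uf))
    for atom in atoms:
        visit(atom, uf)
    return uf


# Example 4: a != b, f(a) = c, forall x. P(f(x))
uf = infer_sorts([("a", "b"), (("f", "a"), "c")], [("P", ("f", "x"))])
```

On Example 4 this yields two classes: one containing a, b, x and the argument of f, the other containing c, the result of f and the argument of P, which correspond to \(\tau _1\) and \(\tau _2\).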
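The two combinators can be written as higher-order functions over the interface of Definition 1. This is a minimal sketch: a strategy is assumed to be a Python callable taking \(\mathsf {E}\) and a quantified formula and returning a list of substitutions (dicts), and the stub strategies `c`, `e`, `u` below return fixed, made-up answers purely for illustration.

```python
def priority(s1, s2):
    """s1;s2 -- use s1's substitutions if there are any, otherwise s2's."""
    def combined(E, q):
        first = s1(E, q)
        return first if first else s2(E, q)
    return combined


def interleaved(s1, s2):
    """s1+s2 -- union of the substitutions returned by both strategies."""
    def combined(E, q):
        result = list(s1(E, q))
        result += [s for s in s2(E, q) if s not in result]
        return result
    return combined


# Stub strategies with fixed (hypothetical) answers.
def c(E, q): return []                 # no conflicting instance found
def e(E, q): return [{"x": "a"}]
def u(E, q): return [{"x": "b"}]

c_e_u = priority(c, priority(e, u))        # c;e;u
c_e_plus_u = priority(c, interleaved(e, u))  # c;e+u, read as c;(e+u)
```

Reading c;e+u as c;(e+u) is an assumption of this sketch, chosen so that conflict-based instantiation keeps the highest priority.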
Enumerative instantiation is the most effective when used as a complement to heuristic strategies. In particular, we will see in the next section that the strategies c;e;u and c;e+u are the most effective strategies for unsatisfiable problems in CVC4.
5 Experiments
We follow the convention in Sect. 4 for identifying configurations based on their instantiation strategy. All configurations of CVC4 use conflict-based instantiation [5, 28] with highest priority, so we omit the prefix “c;” from the names of CVC4 configurations; e.g., e+u in fact means c;e+u. Sort inference, as discussed in Sect. 4.1, is also used by all configurations of CVC4.
5.1 Impact of Enumerative Instantiation in CVC4
In this section, we highlight the impact of enumerative instantiation in CVC4 for unsatisfiable benchmarks. Where applicable, we contrast the impact of enumerative instantiation with that of model-based instantiation on the performance of CVC4 on unsatisfiable benchmarks.^{4}
The comparison of various instantiation strategies supported by CVC4 is summarized in Fig. 3. In the table, each row is dedicated to a library and logic. SMT-LIB is shown in more granularity than TPTP to highlight comparisons of individual strategies. The first column identifies the subset and the second shows its total number of benchmarks. The next seven columns show the number of benchmarks found to be unsatisfiable by each configuration. The last three columns show the results of virtual portfolio solvers, with uport combining e, u, e;u, and e+u; and mport combining e, m, e;m, and e+m; while port combines all seven configurations.
First, we can see that u outperforms m, as it solves \(3\,043\) more benchmarks overall. While this is not close to the performance of E-matching (e), it should be noted that u is highly orthogonal to e, solving \(1\,737\) benchmarks that could not be solved by e^{5}. Combining e with either u or m, using either priority or interleaved instantiation, leads to significant gains in performance. Overall the best configuration is e+u, that is, the interleaving of enumerative instantiation and E-matching, which solves \(20\,535\) benchmarks, that is, 253 more than its counterpart e+m interleaving model-based instantiation and E-matching, and \(1\,295\) more than E-matching alone. In the UFLIA logic, the enumerative techniques are especially effective in comparison with the model-based ones. In particular, they enable CVC4 to solve previously intractable problems, e.g. the family “sexpr” with 32 problems. These are notoriously hard problems involving the verification of C# programs using Spec# [6]. Z3 can solve 31 of them thanks to its advanced optimizations of E-matching [13]. CVC4 previously could solve at most 16 using techniques combining e and m, but u alone could solve 27, and all 32 are solved by e+u. Another example is the family “vcc-havoc” in UFNIA, stemming from the verification of concurrent C with VCC [10]. The strategy e+u solves 940 out of 984 problems, outperforming e and its combinations with m, which solve at most 860 problems^{6}.
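A virtual portfolio counts a benchmark as solved when at least one of its member configurations solves it. The sketch below shows this computation on made-up data: the benchmark names and the contents of `solved_by` are hypothetical placeholders, not results from the evaluation, and `virtual_portfolio` is an illustrative helper.

```python
def virtual_portfolio(solved_by, configs):
    """Benchmarks solved by at least one of the given configurations."""
    return set().union(*(solved_by[c] for c in configs))


# Hypothetical per-configuration results (placeholder benchmark names).
solved_by = {
    "e":   {"b1", "b2", "b3"},
    "u":   {"b2", "b4"},
    "e;u": {"b1", "b2", "b3", "b4"},
    "e+u": {"b1", "b2", "b4", "b5"},
}

uport = virtual_portfolio(solved_by, ["e", "u", "e;u", "e+u"])
# Benchmarks that u solves and e does not: a simple orthogonality measure.
only_u = solved_by["u"] - solved_by["e"]
```

The same set difference, applied to real per-benchmark results, gives the kind of "solved by u but not by e" count used to argue that two strategies are complementary.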
5.2 Comparison Against Other SMT Solvers
In this section, we compare our implementation of enumerative instantiation in CVC4 against another stateoftheart SMT solver: Z3 [14] (version 4.5.1) which, like CVC4, also relies on Ematching instantiation for handling unsatisfiable problems. Before making the comparison, we first summarize the main differences between Z3 and CVC4 here. Z3 uses several optimizations for Ematching that are not implemented in CVC4, including the use of code trees and techniques for applying instantiation incrementally during the CDCL(\(\mathscr {T}\)) search (see Sect. 5 of [13]). It also implements techniques for removing previously considered instantiations from its set of known clauses (see Sect. 7 of [13]). The main advantage of CVC4 with respect to Z3 is its use of conflictbased instantiation \(\mathbf{c}\) [28], which is enabled by default in all strategies we considered. It also supports interleaved instantiation strategies as described in Sect. 4.1, whereas Z3 does not. In addition to these differences, Z3 implements modelbased instantiation m as described in [19], whereas CVC4 implements modelbased instantiation as described in [29]. Finally, CVC4 implements enumerative instantiation as described in this paper, which we compare as an alternative to these implementations.
Figure 4 summarizes the performance of Z3 on our benchmark set. First, as in CVC4, using model-based instantiation to complement E-matching leads to significant gains in Z3: z3 e;m solves a total of 1731 more benchmarks than E-matching alone (z3 e). In comparison with CVC4, the configuration z3 e outperforms e in the logics with non-linear arithmetic and other theories, while e is better in the others. Finally, Z3’s implementation of model-based quantifier instantiation by itself (z3 m) is not effective for unsatisfiable benchmarks, solving only 8951 overall.
To further compare Z3 and CVC4, the third column from the left gives the number of benchmarks solved by CVC4’s E-matching strategy (e), as reported in Fig. 3. The second to last column uporti gives the number of benchmarks solved by at least one of u, e, or e;u in CVC4, where we intentionally omit the interleaved strategy e+u, since Z3 does not support a similar strategy. The column mporti is computed similarly. We compare these with the fifth column, z3 mporti, i.e. the number of benchmarks solved by at least one of z3 m, z3 e or z3 e;m. A comparison of these is given in the cactus plot of Fig. 4. We can see that, due to Z3’s highly optimized implementations, z3 mporti solves the highest number of problems in less than one second (around 13000), whereas the portfolio strategies of CVC4 solve more for larger timeouts. Overall, the best portfolio strategy is enumerative instantiation in CVC4, which solves a total of 21305 unsatisfiable benchmarks, that is, 1965 more than z3 mporti and 470 more than mporti. We thus conclude that the use of enumerative instantiation, when paired with E-matching and conflict-based instantiation in CVC4, improves the state of the art of instantiation-based SMT solvers for unsatisfiable benchmarks.
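A portfolio column such as uporti counts the benchmarks solved by at least one of its member configurations, i.e. the size of a set union. The sketch below illustrates this on invented data; the benchmark names and the helper `portfolio` are hypothetical and are not taken from the artifact.

```python
from typing import Dict, Iterable, Set

def portfolio(solved: Dict[str, Set[str]], configs: Iterable[str]) -> Set[str]:
    """Benchmarks solved by at least one of the given configurations."""
    result: Set[str] = set()
    for name in configs:
        result |= solved[name]
    return result

# Toy data: benchmark names are invented for illustration only.
solved = {
    "u":   {"b1", "b2", "b5"},
    "e":   {"b1", "b3"},
    "e;u": {"b1", "b2", "b3", "b4"},
}

uport = portfolio(solved, ["u", "e", "e;u"])
print(len(uport))                         # 5
# Benchmarks solved by u but not by e (the orthogonality of u with respect to e).
print(sorted(solved["u"] - solved["e"]))  # ['b2', 'b5']
```

The same union, restricted to a time limit per benchmark, is what a cactus plot of portfolio strategies aggregates.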
Comparison with Automated Theorem Provers. Automated theorem provers like Vampire [23] and E [31] use substantially different techniques based on superposition, hence we do not provide an extensive comparison here. However, we do remark that the gains provided by enumerative instantiation were one of the main reasons for CVC4 being more competitive in the 2017 CASC competition of automatic theorem provers [34]. CVC4 placed third in the category with unsatisfiable problems on the empty theory, as in previous years, behind superposition-based theorem provers Vampire and E, which implement complete strategies. There was, however, a \(23\%\) reduction in the number of problems that E solves and CVC4 does not, w.r.t. the previous competition, reducing the gap between the two systems.
Satisfiable Benchmarks. For satisfiable benchmarks^{8}, m solves 1350 benchmarks across all theories. As expected, this is much higher than the number solved by u, which solves 510 benchmarks, all from the empty theory. Nevertheless, there are 13 satisfiable problems solved by u and not by m, which shows that enumerative instantiation has some orthogonality on satisfiable benchmarks as well. We conclude that enumeration not only has superior performance to MBQI on unsatisfiable benchmarks, but also can be an alternative for satisfiable benchmarks in the empty theory.
5.3 Artifact
We have produced an artifact [27] to reproduce the experimental results presented in this paper. The artifact contains the binaries of the SMT solvers CVC4 and Z3, the benchmarks on which they were evaluated, and the running scripts for each configuration evaluated. Detailed instructions are given to perform tests on the various benchmark families with all configurations within the time limits, as well as for retrieving the respective results in CSV format. The artifact has been tested in the virtual machine available at [21].
6 Conclusion
We have presented a strengthening of the Herbrand Theorem, and used it to devise an efficient technique for enumerative instantiation. The implementation of this technique in the state-of-the-art SMT solver CVC4 increases its success rate and outperforms existing implementations of MBQI on unsatisfiable problems with quantified formulas. Given its relatively simple implementation, this technique is well poised as an alternative to MBQI for integration in an instantiation-based SMT solver, both to achieve completeness in first-order logic with the empty theory and equality and to improve performance when theories are considered.
Future work includes further restricting the enumeration space, for instance with ordering criteria in the spirit of resolution-based theorem proving [3]. Another direction is lifting the techniques seen here to reasoning in higher-order logic. To handle quantification over functions it is often necessary to enumerate expressions, and so performing such an enumeration in a principled manner is paramount for this domain. Techniques from syntax-guided function synthesis [1] could be combined with enumerative instantiation to pursue this goal.
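As a concrete picture of what enumerating candidates in a principled manner can mean, the following minimal sketch lists tuples of ground terms in a fair order: tuples whose largest term comes earlier in a given term ordering are produced first, with ties broken lexicographically. This is only an illustration under assumed conventions (terms are plain strings, their list position is their rank, and the helper `enumerate_tuples` is hypothetical); it is not the CVC4 implementation.

```python
from itertools import product
from typing import Iterator, Sequence, Tuple

def enumerate_tuples(terms: Sequence[str], arity: int) -> Iterator[Tuple[str, ...]]:
    """Yield all `arity`-tuples over `terms`, smallest maximal rank first.

    The rank of a term is its position in `terms`; ties on the maximal
    rank are broken lexicographically on the ranks.
    """
    ranks = range(len(terms))
    ordered = sorted(
        product(ranks, repeat=arity),
        key=lambda idx: (max(idx, default=-1), idx),
    )
    for idx in ordered:
        yield tuple(terms[i] for i in idx)

terms = ["a", "b", "f(a)"]
print(list(enumerate_tuples(terms, 2))[:4])
# [('a', 'a'), ('a', 'b'), ('b', 'a'), ('b', 'b')]
```

An instantiation loop would walk this sequence and keep the first tuple whose instance is not already implied; restricting the enumeration space, as suggested above, amounts to pruning or reordering the sequence.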
Data Availability Statement and Acknowledgments. The datasets generated and analyzed during the current study are available in the figshare repository: https://doi.org/10.6084/m9.figshare.5917384.v1.
This work was partially funded by the National Science Foundation under Award 1656926, by the H2020-FETOPEN-2016-2017-CSA project SC\(^\mathsf {2}\) (712689), and by the European Research Council (ERC) starting grant Matryoshka (713999). We would like to thank the anonymous reviewers for their comments. We are grateful to Jasmin C. Blanchette for discussions, encouragement and financial support through his ERC grant.
Footnotes
1. The Herbrand Theorem is generally presented in pure first-order logic without equality, but it also holds for equality: it suffices to consider the equality axioms conjunctively with formulas.
2. For details, see http://matryoshka.gforge.inria.fr/pubs/fol_enumerative_inst/.
3. In SMT parlance, the logic of these benchmarks is quantified EUF.
4. There are technical details that influence the comparison of these techniques (see [26]).
5. The numbers of uniquely solved benchmarks between configurations are available in [26].
6. A detailed comparison by families can be seen in [26].
7. As a rough estimate, the implementation of enumerative instantiation in CVC4 is around 500 lines of code, whereas model-based instantiation is around 4500 lines of code.
8. For further details, see [26].
References
1. Alur, R., Bodík, R., Juniwal, G., Martin, M.M.K., Raghothaman, M., Seshia, S.A., Singh, R., Solar-Lezama, A., Torlak, E., Udupa, A.: Syntax-guided synthesis. In: Formal Methods in Computer-Aided Design (FMCAD), pp. 1–8. IEEE (2013)
2. Baader, F., Nipkow, T.: Term Rewriting and All That. Cambridge University Press, New York (1998)
3. Bachmair, L., Ganzinger, H.: Resolution theorem proving. In: Robinson, A., Voronkov, A. (eds.) Handbook of Automated Reasoning, vol. 1, pp. 19–99 (2001)
4. Barbosa, H.: New techniques for instantiation and proof production in SMT solving. Ph.D. thesis, Université de Lorraine, Universidade Federal do Rio Grande do Norte (2017)
5. Barbosa, H., Fontaine, P., Reynolds, A.: Congruence closure with free variables. In: Legay, A., Margaria, T. (eds.) TACAS 2017. LNCS, vol. 10206, pp. 214–230. Springer, Heidelberg (2017). https://doi.org/10.1007/978-3-662-54580-5_13
6. Barnett, M., DeLine, R., Fähndrich, M., Jacobs, B., Leino, K.R.M., Schulte, W., Venter, H.: The Spec# programming system: challenges and directions. In: Meyer, B., Woodcock, J. (eds.) VSTTE 2005. LNCS, vol. 4171, pp. 144–152. Springer, Heidelberg (2008). https://doi.org/10.1007/978-3-540-69149-5_16
7. Barrett, C., Fontaine, P., Tinelli, C.: The SMT-LIB Standard: Version 2.5. Technical report, Department of Computer Science, The University of Iowa (2015). www.SMT-LIB.org
8. Claessen, K., Lillieström, A., Smallbone, N.: Sort it out with monotonicity. In: Bjørner, N., Sofronie-Stokkermans, V. (eds.) CADE 2011. LNCS (LNAI), vol. 6803, pp. 207–221. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-22438-6_17
9. Claessen, K., Sörensson, N.: New techniques that improve MACE-style finite model finding. In: Proceedings of the CADE-19 Workshop: Model Computation - Principles, Algorithms, Applications (2003)
10. Cohen, E., Dahlweid, M., Hillebrand, M., Leinenbach, D., Moskal, M., Santen, T., Schulte, W., Tobies, S.: VCC: a practical system for verifying concurrent C. In: Berghofer, S., Nipkow, T., Urban, C., Wenzel, M. (eds.) TPHOLs 2009. LNCS, vol. 5674, pp. 23–42. Springer, Heidelberg (2009). https://doi.org/10.1007/978-3-642-03359-9_2
11. Davis, M., Logemann, G., Loveland, D.: A machine program for theorem-proving. Commun. ACM 5(7), 394–397 (1962)
12. Davis, M., Putnam, H.: A computing procedure for quantification theory. J. ACM 7(3), 201–215 (1960)
13. de Moura, L., Bjørner, N.: Efficient E-matching for SMT solvers. In: Pfenning, F. (ed.) CADE 2007. LNCS (LNAI), vol. 4603, pp. 183–198. Springer, Heidelberg (2007). https://doi.org/10.1007/978-3-540-73595-3_13
14. de Moura, L., Bjørner, N.: Z3: an efficient SMT solver. In: Ramakrishnan, C.R., Rehof, J. (eds.) TACAS 2008. LNCS, vol. 4963, pp. 337–340. Springer, Heidelberg (2008). https://doi.org/10.1007/978-3-540-78800-3_24
15. Detlefs, D., Nelson, G., Saxe, J.B.: Simplify: a theorem prover for program checking. J. ACM 52(3), 365–473 (2005)
16. Enderton, H.B.: A Mathematical Introduction to Logic, 2nd edn. Academic Press, Burlington (2001)
17. Ganzinger, H., Korovin, K.: New directions in instantiation-based theorem proving. In: Symposium on Logic in Computer Science, p. 55 (2003)
18. Ge, Y., Barrett, C., Tinelli, C.: Solving quantified verification conditions using satisfiability modulo theories. In: Pfenning, F. (ed.) CADE 2007. LNCS (LNAI), vol. 4603, pp. 167–182. Springer, Heidelberg (2007). https://doi.org/10.1007/978-3-540-73595-3_12
19. Ge, Y., de Moura, L.: Complete instantiation for quantified formulas in satisfiabiliby modulo theories. In: Bouajjani, A., Maler, O. (eds.) CAV 2009. LNCS, vol. 5643, pp. 306–320. Springer, Heidelberg (2009). https://doi.org/10.1007/978-3-642-02658-4_25
20. Gilmore, P.C.: A proof method for quantification theory: its justification and realization. IBM J. Res. Dev. 4(1), 28–35 (1960)
21. Hartmanns, A., Wendler, P.: figshare (2018). https://doi.org/10.6084/m9.figshare.5896615
22. Korovin, K.: Non-cyclic sorts for first-order satisfiability. In: Fontaine, P., Ringeissen, C., Schmidt, R.A. (eds.) FroCoS 2013. LNCS (LNAI), vol. 8152, pp. 214–228. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-40885-4_15
23. Kovács, L., Voronkov, A.: First-order theorem proving and Vampire. In: Sharygina, N., Veith, H. (eds.) CAV 2013. LNCS, vol. 8044, pp. 1–35. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-39799-8_1
24. Prawitz, D.: An improved proof procedure. Theoria 26(2), 102–139 (1960)
25. Reynolds, A.: Conflicts, models and heuristics for quantifier instantiation in SMT. In: Kovács, L., Voronkov, A. (eds.) Vampire Workshop, EPiC Series in Computing, pp. 1–15. EasyChair (2016)
26. Reynolds, A., Barbosa, H., Fontaine, P.: Revisiting enumerative instantiation. Technical report, University of Iowa, Inria (2018). https://hal.inria.fr/hal-01744956
27. Reynolds, A., Barbosa, H., Fontaine, P.: Revisiting enumerative instantiation - Artifact (2018). figshare. https://doi.org/10.6084/m9.figshare.5917384.v1
28. Reynolds, A., Tinelli, C., de Moura, L.M.: Finding conflicting instances of quantified formulas in SMT. In: Formal Methods in Computer-Aided Design (FMCAD), pp. 195–202. IEEE (2014)
29. Reynolds, A., Tinelli, C., Goel, A., Krstić, S., Deters, M., Barrett, C.: Quantifier instantiation techniques for finite model finding in SMT. In: Bonacina, M.P. (ed.) CADE 2013. LNCS (LNAI), vol. 7898, pp. 377–391. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-38574-2_26
30. Robinson, J.A.: A machine-oriented logic based on the resolution principle. J. ACM 12(1), 23–41 (1965)
31. Schulz, S.: E - A brainiac theorem prover. AI Commun. 15(2,3), 111–126 (2002)
32. Stump, A., Sutcliffe, G., Tinelli, C.: StarExec: a cross-community infrastructure for logic solving. In: Demri, S., Kapur, D., Weidenbach, C. (eds.) IJCAR 2014. LNCS (LNAI), vol. 8562, pp. 367–373. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-08587-6_28
33. Sutcliffe, G.: The TPTP problem library and associated infrastructure. J. Autom. Reasoning 43(4), 337–362 (2009)
34. Sutcliffe, G.: The CADE ATP system competition - CASC. AI Mag. 37(2), 99–101 (2016)
Copyright information
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made. The images or other third party material in this book are included in the book's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the book's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.