
1 Introduction

The Curry-Howard isomorphism [39] is a correspondence between systems of formal logic and computational lambda-calculi, interpreting propositions as types and proofs as programs (typed lambda-terms). Coq [10] is an interactive proof assistant based on this correspondence. Its underlying logic is the Calculus of Inductive Constructions [10, 33, 46] – an intuitionistic dependent type theory with inductive types.

Because of the complexity and constructivity of the logic, research on automated reasoning for Coq has been sparse so far, limited mostly to specialised tactics for restricted fragments or decidable theories. The automation currently available in Coq is weaker than in proof assistants based on simpler classical foundations, like Isabelle/HOL [29].

We present a practical, general, fully automated proof search procedure for Coq based on a direct search for type inhabitants. We synthesise Coq terms in an appropriate normal form, using backward-chaining goal-directed search. To make this approach practical, we introduce various heuristics including hypothesis simplification, limited forward reasoning, ordered rewriting and loop checking. For efficiency, we sacrifice completeness for the entire logic of Coq, though we give a completeness proof for a first-order fragment.

We evaluated our procedure on a collection of Coq libraries (40.9% success rate), on CompCert [27] (17.1%) and on the ILTP library [36] (29.5%) of first-order intuitionistic problems. The percentages in brackets denote how many problems were solved fully automatically by the standalone tactic combined with heuristic induction. These results indicate the viability of our approach.

The procedure can be used as a standalone Coq tactic or as a reconstruction backend for CoqHammer [14] – a hammer tool [7] which invokes external automated theorem provers (ATPs) on translations of Coq problems and then reconstructs the found proofs in Coq using the information obtained from successful ATP runs. With our procedure used for reconstruction, CoqHammer achieves a 39.1% success rate on a collection of Coq libraries and 25.6% on CompCert. The reconstruction success rates (i.e. the percentage of problems solved by the ATPs that can be re-proved in Coq) are 87–97%.
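For illustration, a usage sketch of the standalone tactic; the module and tactic names below follow CoqHammer's distributed interface and are an assumption of this sketch, as the text here does not fix them:

```coq
(* Assumes the coq-hammer-tactics package is installed; the module and
   tactic names are those of CoqHammer's distributed interface. *)
From Hammer Require Import Tactics.

Lemma demo (A B : Prop) : A /\ B -> B /\ A.
Proof.
  (* The standalone proof search tactic. *)
  sauto.
Qed.
```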

1.1 Related Work

A preliminary version of a proof search procedure partly based on similar ideas was described in [14]. That procedure is less complete (not complete for the first-order fragment), slower, much more heuristic, and it performs visibly worse as a standalone tactic (see Sect. 5). It includes only some of the actions, restrictions and heuristic improvements described here. In particular, the construction and the unfolding actions are absent, and only special cases of the elimination and the rewriting actions are performed. See Sect. 3.

From a theoretical perspective, a complete proof search procedure for the Cube of Type Systems, which includes the pure Calculus of Constructions without inductive types, is presented in [15]. It is also based on an exhaustive search for type inhabitants. Sequent calculi suitable for proof search in Pure Type Systems are described in [22, 26].

In practice, Chlipala’s crush tactic [9] can solve many commonly occurring Coq goals. However, it is not a general proof search procedure, but an entirely heuristic tactic. In our evaluations, crush performed much worse than our procedure. For Agda [30] there exists the automated prover Agsy [28], which is, however, not much stronger than Coq’s auto.

Proof search in intuitionistic first-order logic has received more attention than inhabitation in complex constructive dependent type theories. We do not attempt here to provide an overview, but only point the reader to [36] for a comparison of intuitionistic first-order ATP systems. The most promising approaches to proof search in intuitionistic first-order logic seem to be connection-based methods [6, 24, 32, 45]. Indeed, the connection-based ileanCoP [31] prover outperforms other intuitionistic ATPs by a wide margin [36].

An automated intuitionistic first-order prover is available in Coq via the firstorder tactic [11], which is based on a contraction-free sequent calculus extending the LJT system of Dyckhoff for intuitionistic propositional logic [17,18,19]. There also exists a Coq plugin for the connection-based JProver [37]. However, the plugin is no longer maintained and is not compatible with recent versions of Coq.

Coq’s type theory may be viewed as an extension of intuitionistic higher-order logic. There exist several automated provers for classical higher-order logic, like Leo [40] or Satallax [8]. Satallax can produce Coq proof terms which use the excluded middle axiom.

The approach to proof search in intuitionistic logic via inhabitation in the corresponding lambda-calculus has a long tradition. It is often an easy way to establish complexity bounds [23, 38, 42]. This approach can be traced back to Ben-Yelles [5, 23] and Wajsberg [43, 44].

One of the motivations for this work is the need for a general automated reasoning procedure in a CoqHammer [14] reconstruction backend. CoqHammer links Coq with general classical first-order ATPs, but tries to find intuitionistic proofs with no additional assumptions and to handle as much of Coq’s logic as possible. A consequence is that the reconstruction mechanism of CoqHammer cannot rely on a direct translation of proofs found by classical ATPs, in contrast to e.g. SMTCoq [2, 21] which integrates external SAT and SMT solvers into Coq.

2 Calculus of Inductive Constructions

In this section, we briefly and informally describe the Calculus of Inductive Constructions (CIC) [10, 33, 46]. For precise definitions and more background, the reader is referred to the literature. Essentially, CIC is a typed lambda calculus with dependent products \(\forall x : \tau . \sigma \) and inductive types.

An inductive type is given by its constructors, presented as, e.g.,

$$ \mathtt {Inductive}\ \mathtt {list}\,(A : \mathtt {Type}) : \mathtt {Type}\ :=\ \mathtt {nil} : \mathtt {list}\,A \mid \mathtt {cons} : A \rightarrow \mathtt {list}\,A \rightarrow \mathtt {list}\,A $$

This declares \(\mathtt {list}\,A\) to be a type of sort \(\mathtt {Type}\) for any parameter A of sort \(\mathtt {Type}\). Above A is a parameter and \(\mathtt {Type}\rightarrow \mathtt {Type}\) is the arity of \(\mathtt {list}\). The types of constructors implicitly quantify over the parameters, i.e., the type of \(\mathtt {cons}\) above is \(\forall A : \mathtt {Type}. A \rightarrow \mathtt {list}\,A \rightarrow \mathtt {list}\,A\). In the presentation we sometimes leave the parameter A implicit.
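In concrete Coq syntax, this declaration reads as follows (wrapped in a module only to avoid clashing with the prelude definition):

```coq
Module ListDef.
  (* The inductive type of lists, as in Coq's standard library. *)
  Inductive list (A : Type) : Type :=
  | nil : list A
  | cons : A -> list A -> list A.
End ListDef.
```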

Propositions (logical formulas) are represented by dependent types. Inductive predicates are represented by dependent inductive types. For example, a dependent inductive type defines the predicate \(\mathtt {Forall}\) on lists, parameterised by a type A and a predicate \(P : A \rightarrow \mathtt {Prop}\). Then \(\mathtt {Forall}\,A\,P\,l\) states that \(P\,x\) holds for every element x of l.
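As an illustration, \(\mathtt {Forall}\) can be defined as below; this is a sketch mirroring the standard library definition, with the parameter A made implicit for readability:

```coq
(* Forall P l holds iff P x holds for every element x of l. *)
Inductive Forall {A : Type} (P : A -> Prop) : list A -> Prop :=
| Forall_nil : Forall P nil
| Forall_cons : forall (x : A) (l : list A),
    P x -> Forall P l -> Forall P (cons x l).
```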

All intuitionistic connectives may be represented using inductive types, e.g.,

$$ \mathtt {Inductive}\ \wedge \,(A\ B : \mathtt {Prop}) : \mathtt {Prop}\ :=\ \mathtt {conj} : A \rightarrow B \rightarrow A \wedge B $$

$$ \mathtt {Inductive}\ \vee \,(A\ B : \mathtt {Prop}) : \mathtt {Prop}\ :=\ \mathtt {inl} : A \rightarrow A \vee B \mid \mathtt {inr} : B \rightarrow A \vee B $$

where \(\wedge \) and \(\vee \) are used in infix notation. All the usual introduction and elimination rules are derivable. Equality can also be defined inductively.
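In Coq syntax, conjunction and disjunction correspond to the following inductive definitions (primed names are used only to avoid clashing with the prelude):

```coq
Inductive and' (A B : Prop) : Prop :=
| conj' : A -> B -> and' A B.

Inductive or' (A B : Prop) : Prop :=
| or_introl' : A -> or' A B
| or_intror' : B -> or' A B.

(* The usual elimination rules are derivable, e.g.: *)
Definition proj1' (A B : Prop) (h : and' A B) : A :=
  match h with
  | conj' a _ => a
  end.
```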

Below by t, u, w, \(\tau \), \(\sigma \), etc., we denote terms, by c, \(c'\), etc., we denote constructors, and by x, y, z, etc., we denote variables. We use \(\vec {t}\) for a sequence of terms \(t_1 \ldots t_n\) of an unspecified length n, and analogously for a sequence of variables \(\vec {x}\). For instance, \(t \vec {y}\) stands for \(t y_1 \ldots y_n\), where n is not important or implicit. Analogously, we use \(\lambda \vec {x} : \vec {\tau } . t\) for \(\lambda x_1 : \tau _1 . \lambda x_2 : \tau _2 . \ldots \lambda x_n : \tau _n . t\), with n implicit or unspecified. We write \(t[\vec {u}/\vec {x}]\) for \(t[u_1/x_1]\ldots [u_n/x_n]\).

The logic of Coq includes over a dozen term formers. The ones recognised by our procedure are: a sort s (e.g. Type, Set or Prop), a variable x, a constant, a constructor c, an inductive type I, an application \(t_1t_2\), an abstraction \(\lambda x : t_1 . t_2\), a dependent product \(\forall x : t_1 . t_2\) (written \(t_1 \rightarrow t_2\) if \(x \notin \mathrm {FV}(t_2)\)), and a case expression \(\mathtt {case}(t; \lambda \vec {a} : \vec {\alpha } . \lambda x : I \vec {q} \vec {a} . \tau ; \vec {x_1} : \vec {\sigma _1} \Rightarrow t_1 \mid \ldots \mid \vec {x_k} : \vec {\sigma _k} \Rightarrow t_k)\).

In a case expression: t is the term matched on; I is an inductive type with constructors \(c_1,\ldots ,c_k\); the type of \(c_i\) is \(\forall \vec {p} : \vec {\rho } . \forall \vec {x_i} : \vec {\tau _i} . I \vec {p} \vec {u_i}\) where \(\vec {p}\) are the parameters of I; the type of t has the form \(I \vec {q} \vec {u}\) where \(\vec {q}\) are the values of the parameters; the type \(\tau [\vec {u}/\vec {a},t/x]\) is the return type, i.e., the type of the whole case expression; \(t_i\) has type \(\tau [\vec {w_i}/\vec {a},c_i \vec {q} \vec {x_i}/x]\) in \(\vec {x_i} : \vec {\sigma _i}\) where \(\vec {\sigma _i} = \vec {\tau _i}[\vec {q}/\vec {p}]\) and \(\vec {w_i} = \vec {u_i}[\vec {q}/\vec {p}]\); \(t_i[\vec {v}/\vec {x}]\) is the value of the case expression if the value of t is \(c_i \vec {q} \vec {v}\).

Note that some equality information is “forgotten” when typing the branches of a case expression. We require \(t_i\) to have type \(\tau [\vec {w_i}/\vec {a},c_i \vec {q} \vec {x_i}/x]\) in context \(\vec {x_i} : \vec {\sigma _i}\). We know that “inside” the ith branch \(t = c_i \vec {q} \vec {x_i}\) and \(\vec {u} = \vec {w_i}\), but this information cannot be used when checking the type of \(t_i\). A consequence is that permutative conversions [41, Chapter 6] are not sound for CIC and this is one reason for the incompleteness of our procedure outside the restricted first-order fragment.
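A standard workaround in Coq practice (not part of the procedure described here) is to abstract the match over an explicit equality proof, making the "forgotten" equation available in each branch; a sketch:

```coq
(* Inside a branch, the equation between the matched term and the
   pattern is not available by default; abstracting the match over an
   equality proof (here applied to eq_refl) recovers it. *)
Definition nat_case_eq (n : nat) : n = 0 \/ (exists m, n = S m) :=
  match n as x return (n = x -> n = 0 \/ (exists m, n = S m)) with
  | O => fun e => or_introl e
  | S m => fun e => or_intror (ex_intro _ m e)
  end eq_refl.
```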

Coq’s notation for case expressions is

$$ \mathtt {match}\ t\ \mathtt {as}\ x\ \mathtt {in}\ I\ \_\ \vec {a}\ \mathtt {return}\ \tau \ \mathtt {with}\ c_1 \vec {x_1} \Rightarrow t_1 \mid \ldots \mid c_k \vec {x_k} \Rightarrow t_k\ \mathtt {end} $$

where \(c_1,\ldots ,c_k\) are all constructors of I, and \(\_\) are the wildcard patterns matching the inductive type parameters \(\vec {q}\). For readability, we often use Coq match notation. When x (resp. \(\vec {a}\)) does not occur in \(\tau \), we omit the \(\mathtt {as}\) (resp. \(\mathtt {in}\)) clause from the match. If \(\tau \) depends on neither x nor \(\vec {a}\), we also omit the \(\mathtt {return}\) clause.
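For example, a dependent return type written with explicit match annotations:

```coq
(* Symmetry of equality with explicit "in"/"return" clauses: the "in"
   clause binds the index z of the matched equality type, and the
   "return" clause gives the dependent return type. *)
Definition eq_sym' {A : Type} (x y : A) (h : x = y) : y = x :=
  match h in (_ = z) return (z = x) with
  | eq_refl => eq_refl
  end.
```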

A typing judgement has the form \(E;\varGamma \vdash t : \tau \) where E is an environment consisting of declarations of inductive types and constant definitions, \(\varGamma \) is a context, i.e., a list of variable type declarations \(x : \sigma \), and \(t,\tau \) are terms. We refer to the Coq manual [10] for a precise definition of the typing rules.

Coq’s definitional equality (conversion rule) includes \(\beta \)- and \(\iota \)-reduction:

$$ (\lambda x : \tau . t)\,u \rightarrow _\beta t[u/x] \qquad \mathtt {case}(c_i \vec {q} \vec {v}; \lambda \vec {a} : \vec {\alpha } . \lambda x : I \vec {q} \vec {a} . \tau ; \vec {x_1} : \vec {\sigma _1} \Rightarrow t_1 \mid \ldots \mid \vec {x_k} : \vec {\sigma _k} \Rightarrow t_k) \rightarrow _\iota t_i[\vec {v}/\vec {x_i}] $$

An inductive type I is non-recursive if the types of constructors of I do not contain I except as the target. We assume the well-foundedness of the relation \(\succ \) defined by: \(I_1 \succ I_2\) iff \(I_2 \ne I_1\) occurs in the arity of \(I_1\) or the type of a constructor of \(I_1\). We write \(I \succ t\) if \(I \succ I'\) for every inductive type \(I'\) occurring in the term t.

3 The Proof Search Procedure

In this section, we describe our proof search procedure. Our approach is based on a direct search for type inhabitants in appropriate normal form [42]. For the sake of efficiency, the normal forms we consider are only a subset of possible CIC normal forms. This leads to incompleteness outside the restricted first-order fragment (see Sects. 3.6 and 4).

More precisely, the inhabitation problem is: given an environment E, a context \(\varGamma \) and a \(\varGamma \)-type \(\tau \) (i.e. \(\varGamma \vdash \tau : s\) for a sort s), find a term t such that \(E;\varGamma \vdash t : \tau \). The environment E will be kept fixed throughout the search, so we omit it from the notation.

A goal is a pair \((\varGamma ,\tau )\) with \(\tau \) a \(\varGamma \)-type, denoted \(\varGamma \vdash ? : \tau \), where \(\varGamma \) is the context and \(\tau \) is the conjecture. A solution of the goal \(\varGamma \vdash ? : \tau \) is any term t such that \(\varGamma \vdash t : \tau \).

3.1 Basic Procedure

The basic inhabitation procedure is to nondeterministically perform one of the following actions, possibly generating new subgoals to be solved recursively. If the procedure fails on one of the subgoals then the action fails. If each possible action fails then the procedure fails. The choices in the actions (e.g. of specific subgoal solutions) are nondeterministic, i.e., we consider all possible choices, each leading to a potentially different solution.

The actions implicitly determine an and-or proof search tree. We leave the exact order in which this tree is traversed unspecified, but a complete search order is to be used, e.g., breadth-first or iterative deepening depth-first.

The procedure supports five term formers in synthesised solutions: variables, constructors, applications, lambda-abstractions, case expressions. These are built with the four actions below.

  1.

    Introduction. If \(\varGamma \vdash ? : \forall x : \alpha . \beta \) then:

    • recursively search for a solution t of the subgoal \(\varGamma , x : \alpha \vdash ? : \beta \);

    • return \(\lambda x : \alpha . t\) as the solution.

  2.

    Application. If \(\varGamma \vdash ? : \tau \) and \(x : \forall \vec {y} : \vec {\sigma } . \rho \) is in \(\varGamma \) then:

    • for \(i=1,\ldots ,n\), recursively search for a solution \(t_i\) of the subgoal \(\varGamma \vdash ? : \sigma _i[t_1/y_1]\ldots [t_{i-1}/y_{i-1}]\);

    • if \(\rho [\vec {t}/\vec {y}] =_{\beta \iota } \tau \) then return \(x \vec {t}\) as the solution.

  3.

    Construction. If \(\varGamma \vdash ? : I \vec {q} \vec {w}\) with \(\vec {q}\) the parameters and \(c : \forall \vec {p} : \vec {\rho } . \forall \vec {y} : \vec {\sigma } . I \vec {p} \vec {u}\) is a constructor of I then:

    • for \(i=1,\ldots ,n\), recursively search for a solution \(t_i\) of the subgoal \(\varGamma \vdash ? : \sigma _i[\vec {q}/\vec {p}][t_1/y_1]\ldots [t_{i-1}/y_{i-1}]\);

    • if \(\vec {u}[\vec {q}/\vec {p}][\vec {t}/\vec {y}] =_{\beta \iota } \vec {w}\) then return \(c \vec {q} \vec {t}\) as the solution.

  4.

    Elimination. If \(\varGamma \vdash ? : \tau \), and \(x : \forall \vec {y} : \vec {\sigma } . I \vec {q} \vec {u}\) is in \(\varGamma \), and \(I : \forall \vec {p} : \vec {\rho } . \forall \vec {a} : \vec {\alpha } . s\) is in E with \(\vec {p}\) the parameters, and \(c_j : \forall \vec {p} : \vec {\rho } . \forall \vec {z_j} : \vec {\gamma _j} . I \vec {p} \vec {u_j}\) for \(j=1,\ldots ,m\) are all constructors of I, then:

    • for \(i=1,\ldots ,n\), recursively search for a solution \(t_i\) of the subgoal \(\varGamma \vdash ? : \sigma _i[t_1/y_1]\ldots [t_{i-1}/y_{i-1}]\);

    • let \(\vec {v} = \vec {u}[\vec {t}/\vec {y}]\) and \(\vec {r} = \vec {q}[\vec {t}/\vec {y}]\);

    • choose \(\tau '\) such that \(\tau '[\vec {v}/\vec {a},x\vec {t}/z] =_{\beta \iota } \tau \);

    • for \(j=1,\ldots ,m\), recursively search for a solution \(b_j\) of \(\varGamma , \vec {z_j} : \vec {\delta _j} \vdash ? : \tau '[\vec {w_j}/\vec {a},c_j\vec {r}\vec {z_j}/z]\) where \(\vec {\delta _j} = \vec {\gamma _j}[\vec {r}/\vec {p}]\) and \(\vec {w_j} = \vec {u_j}[\vec {r}/\vec {p}]\);

    • return \(\mathtt {case}(x \vec {t}; \lambda \vec {a} : \vec {\alpha } . \lambda z : I \vec {r} \vec {a} . \tau '; \vec {z_1} : \vec {\gamma _1} \Rightarrow b_1 \mid \ldots \mid \vec {z_m} : \vec {\gamma _m} \Rightarrow b_m )\) as the solution.
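For instance, for the goal \(h : A \vee B \vdash ? : B \vee A\), the introduction, elimination and construction actions combine to synthesise a case expression; in Coq syntax:

```coq
(* Elimination on a disjunction hypothesis produces a case expression
   with one branch per constructor; each branch is closed by the
   construction action. *)
Definition or_comm' (A B : Prop) (h : A \/ B) : B \/ A :=
  match h with
  | or_introl a => or_intror a
  | or_intror b => or_introl b
  end.
```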

The intuition is that we search for normal inhabitants of a type. For instance, if \(\varGamma \vdash (\lambda x : \alpha . t) u : \tau \) then also \(\varGamma \vdash t[u/x] : \tau \), so it suffices to consider solutions without \(\beta \)-redexes. Assuming \(\tau \) is not a sort, it suffices to consider only variables and constructors at the head of the solution term, because \(I : \forall \vec {x} : \vec {\sigma } . s\) with s a sort for any inductive type I. This of course causes incompleteness because it may be necessary to search for inhabitants of a sort s in a subgoal.

It is straightforward to check (by inspecting the typing rules of CIC) that the described procedure is sound, i.e., any term obtained using the procedure is indeed a solution.

3.2 Search Restrictions

We now introduce some restrictions on the search procedure, i.e., on when each action may be applied. Note that this may compromise completeness, but not soundness. For a first-order fragment completeness is in fact preserved (Sect. 4).

  • Eager introduction. Perform introduction eagerly, i.e., if \(\varGamma \vdash ? : \forall x : \alpha . \beta \) then immediately perform introduction without backtracking.

    This is justified by observing that we may restrict the search to solutions in \(\eta \)-long normal form. However, in general \(\eta \)-long normal forms may not exist.

  • Elimination restriction. Perform elimination only immediately after introduction or another elimination.

    The intuitive justification is that in a term of the form \(\mathtt {case}(t; \ldots ; \vec {x_1} : \vec {\sigma _1} \Rightarrow t_1 \mid \ldots \mid \vec {x_k} : \vec {\sigma _k} \Rightarrow t_k)\,u\) we may usually move u inside the match while preserving the type, obtaining \(\mathtt {case}(t; \ldots ; \vec {x_1} : \vec {\sigma _1} \Rightarrow t_1 u \mid \ldots \mid \vec {x_k} : \vec {\sigma _k} \Rightarrow t_k u)\). However, this is not always possible in CIC (see Sect. 2).

  • Eager simple elimination. Immediately after adding \(x : I \vec {q} \vec {u}\) with parameters \(\vec {q}\) into the context \(\varGamma \) (by the introduction or the elimination action), if I is a non-recursive inductive type and \(I \succ \vec {q}\), then perform elimination of x eagerly and remove the declaration of x from the context.

    If \(\varGamma \vdash (\lambda x : I \vec {q} . t) : \tau \) then usually \(\varGamma \vdash (\lambda x : I \vec {q} . t') : \tau \) where \(t' = \mathtt {case}(x; \ldots ; \vec {x_1} : \vec {\sigma _1} \Rightarrow t \mid \ldots \mid \vec {x_k} : \vec {\sigma _k} \Rightarrow t)\). However, in general replacing a subterm \(u'\) of a term u with \(u''\) may change the type of u, even if \(u',u''\) have the same type. See Sect. 4.

  • Loop checking. If the same conjecture is encountered for the second time on the same proof search tree branch without performing the introduction action in the meantime, then fail.

    This is justified by observing that if \(\varGamma \vdash t[u/x] : \tau \) and \(\varGamma \vdash u : \tau \), then we can just use u instead of t as the solution. In general, this restriction also causes incompleteness, for the same reason as the previous one.

It is instructive to observe how the elimination restrictions specialise to inductive definitions of logical connectives. For example, the eager simple elimination restriction for conjunction is that a goal \(\varGamma , x : \alpha \wedge \beta \vdash ? : \tau \) should be immediately replaced by \(\varGamma , x_1 : \alpha , x_2 : \beta \vdash ? : \tau \).
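In tactic terms, this specialisation corresponds to eagerly destructing the conjunction hypothesis:

```coq
Lemma and_elim_example (A B C : Prop) : A /\ B -> (A -> B -> C) -> C.
Proof.
  intros x f.
  (* Eager simple elimination: replace x : A /\ B
     by x1 : A and x2 : B in the context. *)
  destruct x as [x1 x2].
  exact (f x1 x2).
Qed.
```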

3.3 Heuristic Improvements

The above presentation of the proof search procedure does not yet directly lead to a practical implementation. We thus introduce “heuristic” improvements. All of them preserve soundness, but some may further compromise completeness. In fact, we believe several of the “heuristics” (e.g. most context simplifications and forward reasoning) actually do preserve completeness (under certain restrictions), but we did not attempt to rigorously prove it.

  • In the application action, instead of checking \(\rho [\vec {t}/\vec {y}] =_{\beta \iota } \tau \) a posteriori, use unification modulo simple heuristic equational reasoning to choose an appropriate \((x : \tau ) \in \varGamma \), possibly introducing existential metavariables to be instantiated later (like with Coq’s eapply tactic). Analogously, we use unification in the construction action.

  • In the elimination action, the choice of \(\tau '\) is done heuristically without backtracking. In practice, we use either Coq’s destruct or inversion tactic, depending on the form of the inductive type I.

  • Immediately after the introduction action, simplify the context:

  • replace \(h : \forall \vec {x} : \vec {\sigma } . \tau _1 \wedge \tau _2\) with \(h_1 : \forall \vec {x} : \vec {\sigma } . \tau _1\) and \(h_2 : \forall \vec {x} : \vec {\sigma } . \tau _2\);

  • replace \(h : \forall \vec {x} : \vec {\sigma } . \tau _1 \vee \tau _2 \rightarrow \rho \) with \(h_1 : \forall \vec {x} : \vec {\sigma } . \tau _1 \rightarrow \rho \) and \(h_2 : \forall \vec {x} : \vec {\sigma } . \tau _2 \rightarrow \rho \);

  • replace \(h : \forall \vec {x} : \vec {\sigma } . \tau _1 \wedge \tau _2 \rightarrow \rho \) with \(h' : \forall \vec {x} : \vec {\sigma } . \tau _1 \rightarrow \tau _2 \rightarrow \rho \);

  • replace \(h : \exists x : \sigma . \tau \) with \(x : \sigma \) and \(h' : \tau \) (assuming x fresh);

  • remove some intuitionistic tautologies;

  • perform invertible forward reasoning, i.e., if \(h : \sigma \) and \(h' : \sigma \rightarrow \tau \) are in \(\varGamma \) then we replace \(h'\) with \(h'' : \tau \).

  • use Coq’s subst tactic to rewrite with equations on variables;

  • perform rewriting with some predefined lemmas from a hint database.

  • Immediately after simplifying the context as above, perform some limited forward reasoning. For instance, if \(h : P\,a\) and \(h' : \forall x . P x \rightarrow \varphi \) are in \(\varGamma \), then add \(h'' : P\,a \rightarrow \varphi [a/x]\) to \(\varGamma \). To avoid looping, we do not use newly derived facts for further forward reasoning.

  • Elimination on terms matched in case expressions is done eagerly. In other words, if a case expression \(\mathtt {case}(t; \ldots )\) occurs in the conjecture or the context, with t closed, then we immediately perform the elimination action on t.

  • After performing each action, simplify the conjecture by reducing it to (weak) normal form (using Coq’s cbn tactic) and rewriting with some predefined lemmas from a hint database.

  • We use a custom leaf solver at the leaves of the search tree. The leaf solver eagerly splits the disjunctions in the context (including quantified ones), uses Coq’s eauto with depth 2, and tries the Coq tactics congruence (congruence closure) and lia (linear arithmetic).

  • We extend the search procedure with two more actions (performed non-eagerly with backtracking):

  1.

    Unfolding. Unfold a Coq constant definition, provided some heuristic conditions on the resulting unfolding are satisfied.

  2.

    Rewriting. The order > on constants is defined to be the transitive closure of \(\{(c_1,c_2) \mid c_2 \mathrm {\ occurs\ in\ the\ definition\ of\ } c_1\}\). By \(\mathrm {lpo}_>\) we denote the lifting of > to the lexicographic path order (LPO) on terms [3, Section 5.4.2]. For the LPO lifting, we consider only terms which have obvious first-order counterparts, e.g., \(f\,x\,y\,z\) with f a constant corresponds to a first-order term f(x, y, z). The action is then as follows. Assume \(h : \forall \vec {x} : \vec {\sigma } . t_1 = t_2\) is in \(\varGamma \).

  • If \(\mathrm {lpo}_>(t_1,t_2)\) then rewrite with h from left to right, in the conjecture and the hypotheses, generating new subgoals and introducing existential metavariables for \(\vec {x}\) as necessary.

  • If \(\mathrm {lpo}_>(t_2,t_1)\) then rewrite with h from right to left.

  • If \(t_1,t_2\) are incomparable with \(\mathrm {lpo}_>\), then rewrite heuristically from left to right, or right to left if that fails.

For heuristic rewriting in the last point, we use the leaf solver to discharge the subgoals and we track the hypotheses to avoid unordered heuristic rewriting with the same hypothesis twice.

  • Immediately after forward reasoning, eagerly perform rewriting with those hypotheses which satisfy: (1) the target can be ordered with the lexicographic path order described above, and (2) the generated subgoals can be solved with the leaf solver.
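As a small illustration of left-to-right rewriting with a quantified equation (here \(f (S x)\) is greater than its subterm \(f x\) in any subterm-compatible order, so the hypothesis is oriented left to right):

```coq
(* Repeated left-to-right rewriting with an oriented hypothesis;
   "rewrite !h" rewrites until no redex remains, turning the goal
   f 3 = f 0 into f 0 = f 0. *)
Lemma rew_example (f : nat -> nat) (h : forall x, f (S x) = f x) :
  f 3 = f 0.
Proof.
  rewrite !h.
  reflexivity.
Qed.
```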

3.4 Search Strategy

The proof search strategy is based on (bounded or iterative deepening) depth-first search. We put a bound on the cost of proof search according to one of the following two cost models.

  • Depth cost model. The depth of the search tree is bounded, with the leaf solver tactic tried at the leaves.

  • Tree cost model. The size of the entire search tree is bounded, but not the depth directly. The advantages of this approach are that (1) it allows finding deep proofs with small branching, and (2) it is easier to choose a single cost bound which performs well in many circumstances. However, this model performs slightly worse on pure first-order problems (see Sect. 5).
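The leaf solver mentioned above combines standard Coq tactics; for illustration, the kinds of leaf goals discharged by congruence and lia:

```coq
From Coq Require Import Lia.

(* A leaf goal closed by congruence (congruence closure). *)
Lemma leaf_congruence (A : Type) (f : A -> A) (x y : A) :
  x = y -> f x = f y.
Proof. intros. congruence. Qed.

(* A leaf goal closed by lia (linear arithmetic). *)
Lemma leaf_lia (n m : nat) : n <= m -> n <= m + 1.
Proof. intros. lia. Qed.
```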

3.5 Soundness

Our proof search procedure, including all heuristic improvements, is sound. For the basic procedure (Sects. 3.1 and 3.2) this can be shown by straightforward induction, noting that the actions essentially directly implement CIC typing rules. For the heuristic improvements (Sect. 3.3), one could show soundness by considering the shapes of the proof terms. This is straightforward but tedious. The implementation in the Coq tactic monad guarantees soundness as only well-typed proof terms can be produced by standard Coq tactics.

3.6 Incompleteness

The inhabitation procedure presented above is not complete for the full logic of Coq. The reasons for incompleteness are as follows.

  1.

    Higher-order unification in the application action is not sufficient for completeness in the presence of impredicativity. A counterexample (from [15]) is

    $$ Q : * \rightarrow *, u : \forall P : * . Q P \rightarrow P, v : Q (A \rightarrow B), w : A \vdash ? : B. $$

    The solution is \(u (A \rightarrow B) v w\). The target P of u unifies with B, but this does not provide the right instantiation (which is \(A \rightarrow B\)) and leaves an unsolvable subgoal ? : QB. It is worth noting, however, that this particular example can be solved thanks to limited forward reasoning (see Sect. 3.3).

  2.

    For efficiency, most variants of our tactics do not perform backtracking on instantiations of existential metavariables.

  3.

    The normal forms we implicitly search for do not suffice for completeness. One reason is that permutative conversions [41, Chapter 6] are not sound for dependent case expressions \(\mathtt {case}\) in CIC if the return type \(\tau \) depends on \(\vec {a}\) or x. We elaborate on this in the next section.

  4.

    The full logic of Coq contains more term formers than the five supported by our procedure: fix, cofix, let, etc. In particular, our inhabitation procedure never performs induction over recursive inductive types, which requires fix. It does, however, perform reasoning by cases via the elimination action.
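The counterexample from point 1 above can be written directly in Coq, reading \(*\) as \(\mathtt {Prop}\):

```coq
Section Impredicativity.
  Variables (A B : Prop) (Q : Prop -> Prop).
  Variable u : forall P : Prop, Q P -> P.
  Variable v : Q (A -> B).
  Variable w : A.

  (* The instantiation P := A -> B is impredicative and is not found
     by unifying the target P of u with B. *)
  Definition solution : B := u (A -> B) v w.
End Impredicativity.
```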

We believe the compromises on completeness are in practice not very severe and our procedure may reasonably be considered a general automated proof search method for CIC without fixpoints. In fact, many of the transformations on proof terms corresponding to the “restrictions” and “heuristics” above would preserve completeness in the presence of definitional proof irrelevance.

4 Normal Forms and Partial Completeness

The basic inhabitation procedure (Sects. 3.1 and 3.2) with a restricted loop check is complete for a first-order fragment of CIC. We conjecture that a variant of our procedure is also complete for CIC with definitional proof irrelevance, with only non-dependent elimination, and without fixpoints. First, we describe the subset of normal forms our procedure searches for.

Permutative conversions are the two reductions below. They “move around” case expressions to expose blocked redexes.

$$ \mathtt {case}(t; \lambda \vec {a} : \vec {\alpha } . \lambda x : I_1 \vec {q_1} \vec {a} . \sigma \rightarrow \tau ; \vec {x_1} : \vec {\sigma _1} \Rightarrow t_1 \mid \ldots \mid \vec {x_k} : \vec {\sigma _k} \Rightarrow t_k)\,u \;\rightarrow _{\rho _1}\; \mathtt {case}(t; R'; \vec {x_1} : \vec {\sigma _1} \Rightarrow t_1 u \mid \ldots \mid \vec {x_k} : \vec {\sigma _k} \Rightarrow t_k u) $$

$$ \mathtt {case}(\mathtt {case}(t; Q; \vec {x_1} : \vec {\tau _1} \Rightarrow t_1 \mid \ldots \mid \vec {x_k} : \vec {\tau _k} \Rightarrow t_k); R; P) \;\rightarrow _{\rho _2}\; \mathtt {case}(t; R''; \vec {x_1} : \vec {\tau _1} \Rightarrow \mathtt {case}(t_1; R; P) \mid \ldots \mid \vec {x_k} : \vec {\tau _k} \Rightarrow \mathtt {case}(t_k; R; P)) $$

In the second reduction rule, P stands for a list of case patterns \(\vec {y_1} : \vec {\sigma _1} \Rightarrow w_1 \mid \ldots \mid \vec {y_m} : \vec {\sigma _m} \Rightarrow w_m\). We assume \(\vec {x_i}\) do not occur in P. Similarly, \(Q,R,R',R''\) stand for the specifications of the return types, where \(Q = \lambda \vec {a} : \vec {\alpha } . \lambda x : I_1 \vec {q_1} \vec {a} . I \vec {v}\), \(R = \lambda \vec {b} : \vec {\beta } . \lambda x : I_2 \vec {q_2} \vec {b} . \tau \), \(R' = \lambda \vec {a} : \vec {\alpha } . \lambda x : I_1 \vec {q_1} \vec {a} . \tau \), \(R'' = \lambda \vec {a} : \vec {\alpha } . \lambda x : I_1 \vec {q_1} \vec {a} . \tau \).

We write \(\rightarrow _\rho \) for the union of \(\rightarrow _{\rho _1}\) and \(\rightarrow _{\rho _2}\). Note that \(\rho _1\)-reduction may create \(\beta \)-redexes, and \(\rho _2\)-reduction may create \(\iota \)-redexes.

The right-hand sides of the \(\rho \)-rules may be ill-typed if \(\sigma ,\tau \) above depend on any of \(\vec {a},\vec {b},x\), i.e., if the return type varies across the case expression branches. Moreover, even if the type of a \(\rho \)-redex subterm is preserved by a \(\rho \)-contraction, the type of the entire term might not be. For example, assume the context \(\varGamma \) contains \(P : A \rightarrow s\), \(F : \forall x : A . P x\), \(t : A\), \(t' : A\) and \(p : P t\). Then \(\varGamma \vdash F t : P t\) but in general \(\varGamma \not \vdash F t' : P t\) unless \(t =_{\beta \iota } t'\).

An analogous problem occurs when attempting to define \(\eta \)-long normal forms – normal forms \(\eta \)-expanded as much as possible without creating \(\beta \)-redexes. The \(\eta \)-expansion of a term t of type \(\forall x : \alpha . \beta \) is \(\lambda x : \alpha . t x\) where \(x \notin \mathrm {FV}(t)\). We do not consider \(\eta \)-expansions for inductive types. If the conversion rule does not include \(\eta \) (Coq’s does since v8.4), then \(\eta \)-expanding a subterm may change the type of the entire term. Even assuming the conversion rule does include \(\eta \), defining \(\eta \)-long forms in the presence of dependent types is not trivial if we consider \(\eta \)-expansions inside variable type annotations [16]. However, for our purposes a simpler definition by mutual induction on term structure is sufficient.

A term t is a long normal form in \(\varGamma \) (\(\varGamma \)-lnf) if:

  • \(t = \lambda x : \alpha . t'\), and \(\varGamma \vdash t : \forall x : \alpha . \beta \), and \(t'\) is a \(\varGamma \)-\((x:\alpha )\)-lnf (defined below);

  • \(t = x \vec {u}\), and \(\varGamma \vdash t : \tau \) with \(\tau \) not a product, and each \(u_i\) is a \(\varGamma \)-lnf and not a case expression;

  • \(t = c \vec {q} \vec {v}\), and \(\varGamma \vdash t : I \vec {q} \vec {w}\) with \(\vec {q}\) the parameters, and c is a constructor of I, and each \(v_i\) is a \(\varGamma \)-lnf and not a case expression;

  • \(t = \mathtt {case}(x\vec {u}; \lambda \vec {a} : \vec {\alpha } . \lambda x : I \vec {q} \vec {a} . \sigma ; \vec {x_1} : \vec {\tau _1} \Rightarrow t_1 \mid \ldots \mid \vec {x_k} : \vec {\tau _k} \Rightarrow t_k)\), and \(\varGamma \vdash t : \tau \) with \(\tau \) not a product, and each \(u_i\) is a \(\varGamma \)-lnf and not a case expression, and each \(t_i\) is a \(\varGamma \)-\((\vec {x_i}:\vec {\tau _i})\)-lnf.

A term t is a \(\varGamma \)-\(\varDelta \)-lnf if:

  • \(\varDelta = \langle \rangle \) and t is a \(\varGamma \)-lnf;

  • \(\varDelta = x : \alpha ,\varDelta '\), and \(\alpha \) is not \(I \vec {q} \vec {u}\) for I non-recursive with \(I \succ \vec {q}\), and t is a \(\varGamma ,x:\alpha \)-\(\varDelta '\)-lnf;

  • \(\varDelta = x : I \vec {q} \vec {u},\varDelta '\), and I is non-recursive with \(I \succ \vec {q}\), and \(\vec {q}\) are the parameter values, and \(t = \mathtt {case}(x; \lambda \vec {a} : \vec {\alpha } . \lambda x : I \vec {q} \vec {a} . \tau ; \vec {x_1} : \vec {\tau _1} \Rightarrow t_1 \mid \ldots \mid \vec {x_k} : \vec {\tau _k} \Rightarrow t_k)\), and \(\varGamma ,\varDelta \vdash t : \tau [\vec {u}/\vec {a}]\), and each \(t_i\) is a \(\varGamma \)-\(\varDelta ',\vec {x_i}:\vec {\tau _i}\)-lnf (then \(x \notin \mathrm {FV}(t_i)\)).

For the supported term formers (variables, constructors, applications, lambda-abstractions, case expressions), this definition essentially describes \(\eta \)-long \(\beta \iota \rho \)-normal forms transformed to satisfy the additional restrictions corresponding to the elimination and the eager simple elimination restrictions from Sect. 3.2.

Given an inhabitation problem \(\varGamma \vdash ? : \tau \), our procedure searches for a minimal solution in \(\varGamma \)-lnf. Solutions in \(\varGamma \)-lnf might not exist for some solvable problems. As outlined above, there are essentially two reasons: (1) with dependent elimination the return type may vary across case branches, which in particular makes permutative conversions unsound; (2) replacing a proof with a different proof of the same proposition is not sound if proofs occur in types. Point (1) may be dealt with by disallowing dependent elimination, and (2) by assuming definitional proof irrelevance. Hence, we conjecture completeness (of an appropriate variant of the procedure) for CIC with definitional proof irrelevance, with only non-dependent elimination, and without fixpoints.

Here we only prove that in a first-order fragment for every inhabited type there exists an inhabitant in \(\varGamma \)-lnf. The precise definition of the considered first-order fragment may be found in the appendix. It is essentially intuitionistic first-order logic with two predicative sorts \(\mathtt {Prop}\) and \(\mathtt {Set}\), non-dependent inductive types in \(\mathtt {Prop}\), non-dependent pattern-matching, and terms of a type in \(\mathtt {Set}\) restricted to applicative form. We use \(\vdash _\mathrm {fo}\) for the typing judgement of the first-order fragment.

For the theorem below, we consider a basic variant of our procedure (Sects. 3.1 and 3.2) which does not perform the looping check for conjectures of sort \(\mathtt {Set}\).

Theorem 1 (Completeness for a first-order fragment)

If the inhabitation problem \(\varGamma \vdash _{\mathrm {fo}} ? : \tau \) has a solution, then the inhabitation procedure will find one.

Proof (sketch)

Assume \(\varGamma \vdash _{\mathrm {fo}} t : \tau \). It suffices to show that t may be converted into a \(\varGamma \)-lnf with the same type.

First, one shows that \(\beta \iota \rho \)-reduction enjoys subject reduction and weak normalisation. The weak normalisation proof is analogous to [41, Theorem 6.1.8].

Next, one shows that \(\beta \iota \rho \)-normal forms may be expanded to \(\eta \)-long \(\beta \iota \rho \)-normal forms. Some care needs to be taken to avoid creating a \(\rho \)-redex when expanding a case expression.

Finally, one performs transformations corresponding to the elimination and the eager simple elimination restrictions. See the appendix for details.

5 Evaluation

We performed several empirical evaluations of our proof search procedure. First, on a collection of Coq libraries and, separately, on CompCert [27], we measured the effectiveness of the procedure as a standalone proof search tactic, as well as its effectiveness as a reconstruction backend for CoqHammer. We also measured the effectiveness of our procedure on pure intuitionistic first-order logic by evaluating it on the ILTP library [36] of first-order intuitionistic problems.

Our proof search procedure aims to provide general push-button automation for CIC without fixpoints, based on sound theoretical foundations. As such, it is in a category of its own, as far as we know. Our evaluations in several different scenarios indicate the practical viability of our approach despite its generality. It should be noted that the tactics we compare against are not intended for full automation, but target specific small fragments of CIC or require hand-crafted hints for effective automation.

Detailed evaluation results, complete logs, Coq problem files and conversion programs are available in the online supplementary material [12]. The collection of Coq libraries and developments on which we evaluated our procedure includes: the coq-ext-lib library, the Hahn library, int-map (a library of maps indexed by binary integers), the Coq files accompanying the first three volumes of the Software Foundations book series [1, 34, 35], a general topology library, and several other projects from coq-contribs. The full list is available at [12].

The results of the standalone (left) and CoqHammer backend (right) evaluations on 4494 problems from a collection of Coq developments, and, separately, the results on CompCert, are presented below.

[Figure: results of the standalone and CoqHammer backend evaluations on the collection of Coq developments and on CompCert]

For the evaluation on CompCert, the number of problems for the CoqHammer backend evaluation is smaller because the CoqHammer translation cannot handle some Coq goals (e.g. with existential metavariables) and these were not included.

For the standalone evaluation, we first try to apply our procedure; if it fails, we try heuristic unfolding, and then we try induction on each available hypothesis followed by the tactic. This gives a better idea of the usefulness of our procedure, because the core tactic itself cannot succeed on any problems that require non-trivial induction. For comparison, an analogous combination of standard Coq tactics (or crush) with unfolding and induction is used.
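The combination just described could be sketched in Ltac roughly as follows. This is a hypothetical sketch, not the actual implementation: sauto is the paper's tactic (assumed available from the CoqHammer plugin), and autounfold stands in for the heuristic unfolding, whose details we do not reproduce here.

```coq
(* Hypothetical Ltac sketch of the "sauto+i" combination: try the core
   tactic, then unfolding, then induction on each hypothesis in turn.
   The || combinator commits to the first branch that succeeds, and
   "match goal" backtracks over the hypotheses H on failure. *)
Ltac sauto_plus_i :=
  solve [ sauto ]
  || solve [ autounfold in *; sauto ]
  || match goal with
     | [ H : _ |- _ ] => solve [ induction H; sauto ]
     end.
```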

For the standalone evaluation, by “sauto+i” we denote our proof search procedure, by “yelles+i” the preliminary procedure from [14], by “crush+i” a slightly improved version of the crush tactic from [9], and by “coq+i” a mix of standard Coq automation tactics (including eauto, lia, congruence and firstorder). All of these are combined with induction and unfolding, as described above.

We also performed a standalone evaluation without combining the tactics with induction or unfolding. The results are presented below. For the standalone evaluation without induction, by “coq-no-fo” we denote the same mix of standard Coq automation tactics as “coq” but not including the firstorder tactic.

[Figure: results of the standalone evaluation without induction or unfolding]

The results of the standalone evaluations indicate that our procedure is useful as a standalone Coq tactic in a push-button automated proof search scenario, performing comparably or better than other tactics available for Coq.

For the evaluation of our procedure as a CoqHammer backend, we use 12 variants of our tactics (including 3 variants based on the incomplete preliminary procedure from [14]) run in parallel (i.e. a separate core assigned to each variant) for 30s (the “sauto-12” row). We included the variants of the preliminary procedure from [14] to increase the diversity of the solved problems. The procedure from [14], while much more ad-hoc and heuristic, is essentially a less general version of the present one. The point of this evaluation is to show that our approach may be engineered into an effective CoqHammer reconstruction backend, and not to compare the present procedure with its limited ad-hoc variant. For comparison, we used 4 variants of combinations of standard Coq automation tactics (the “coq-4” row). We show the total number and the percentage of problems solved with any of the external provers and premise selection methods employed by CoqHammer. The external provers were run for 30s each. The reconstruction success rates (the “re-proved” column) are calculated separately for each prover and a range is presented.

A common property of the chosen libraries is that they use the advanced features of Coq sparingly and are written in a style where proofs are broken up into many small lemmas. Some do not use much automation, and the Software Foundations files contain many exercises with relatively simple proofs. Moreover, some of the developments modify the core hints database which is then used by the tactics. The resulting problem set is suitable for comparing proof search procedures on a restricted subset of Coq logic, but does not necessarily reflect Coq usage in modern developments. This explains the high success rate compared to CompCert. Also, CompCert uses classical logic, while our procedure tries to find only intuitionistic proofs. Hence, a lower success rate is to be expected since it is harder or impossible to re-prove some of the lemmas constructively.

The results of the evaluation on the ILTP library v1.1.2 [36] follow.

ILTP (2574 problems, 600s)

tactic       proved   proved %   0.00    0.25    0.50    0.75    1.00
hprover      662      25.7%      96.3%   52.1%   72.7%   43.9%   12.9%
tprover      636      24.7%      96.3%   52.1%   52.7%   44.6%   11.8%
yelles       602      23.4%      79.4%   40.1%   52.7%   47%     12.2%
sauto-3      760      29.5%      –       –       –       –       –
hprove10     565      22.0%      90.8%   40.8%   50.9%   38.7%   10.3%
firstorder   467      18.1%      95.4%   56.3%   58.2%   20%     6.7%
ileanCoP     934      36.3%      97.7%   95.8%   96.4%   95.8%   16.8%

[Figure: number of problems solved within a given time limit]

We compared our procedure with firstorder [11] and with ileanCoP 1.2 [31] – a leading connection-based first-order intuitionistic theorem prover. We converted the library files to the appropriate formats (Coq or ileanCoP). For ileanCoP and firstorder, the converted problem files include equality axioms (reflexivity, symmetry, transitivity, congruence). These axioms were not added for our procedure because it can already perform limited equality reasoning. We used exhaustive variants of our tactics which perform backtracking on instantiations of existential metavariables and do not perform eager simple elimination, eager rewriting or unordered rewriting. The proof search procedure is run in an iterative deepening fashion, increasing the depth or cost bound on failure. The “hprover” row shows the results for the depth cost model, “tprover” for the tree cost model, “yelles” for the preliminary procedure from [14], and “sauto-3” for the combination of the results of the three variants above. The columns labelled with a number R show the percentage of problems with difficulty rating R for which proofs were found. The graph below the table shows how many problems were solved within a given time limit.

The firstorder tactic is generally faster than our procedure, but it finds far fewer proofs for problems with a high difficulty rating. For firstorder we did not implement iterative deepening, because the depth limit is a global parameter not changeable at the tactic language level. We set the limit to 10. To provide a fair comparison, we also evaluated our proof search procedure with the depth cost model and the depth bound fixed at 10 (“hprove10”).

In combination, the three iterative deepening variants of our procedure managed to find proofs for 80 theorems that were not proven by ileanCoP. Overall, the performance of ileanCoP is much better, but it does not produce proof terms and is restricted to pure intuitionistic first-order logic.

6 Examples

In this section, we give some examples of the use of our proof search procedure as a standalone Coq tactic. The core inhabitation procedure is implemented in the sauto tactic which uses the tree cost model and bounds the proof search by default. There are several other tactics which invoke different variants of the proof search procedure. The ssimpl tactic performs the simplifications, forward reasoning and eager actions described in Sects. 3.2 and 3.3. The implementation is available as part of a recent version of the CoqHammer tool [13, 14], and it is used as the basis of its reconstruction tactics.

Our first example is a statement about natural numbers. It can be proven by sauto without any lemmas because the natural numbers, disjunction, existential quantification and equality are all inductive types.
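The original listing is not reproduced in this rendering. A plausible statement of the same flavour (ours, not the paper's) is the following, which combines nat, disjunction, an existential and equality:

```coq
(* Assumes the CoqHammer plugin is installed; this provides sauto. *)
From Hammer Require Import Tactics.

(* A hypothetical statement in the spirit of the paper's first example:
   proving it requires case analysis on the inductive type nat. *)
Lemma nat_sketch : forall n : nat, n = 0 \/ exists m : nat, n = S m.
Proof. sauto. Qed.
```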

Note that because the proof requires inversion on nat, it cannot possibly be created by any of the standard Coq automation tactics.

Because < is defined in terms of \(\le \), which is an inductive type, sauto can prove the following lemma about lists.
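The paper's listing is omitted here; a hypothetical lemma of this kind (our own illustration) might look as follows. Since \(<\) unfolds to the inductive predicate \(\le \), sauto can reason about it directly.

```coq
(* Assumes the CoqHammer plugin is installed; this provides sauto. *)
From Hammer Require Import Tactics.

(* Illustrative only: a list lemma stated via <, i.e. via the inductive
   predicate le. *)
Lemma length_lt_cons (A : Type) (x : A) (l : list A) :
  length l < length (x :: l).
Proof. sauto. Qed.
```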

The next example concerns the big-step operational semantics of simple imperative programs. The commands of an imperative program are defined with an inductive type cmd. The big-step operational semantics is represented with a dependent inductive type, and command equivalence is defined in terms of this relation. We skip the details of these definitions.

Then sauto can fully automatically prove the following two lemmas. The first one states the associativity of command sequencing. The second establishes the equivalence of the while command with its one-step unfolding. On a computer with a 2.5 GHz processor, in both cases sauto finds a proof in less than 0.5 s.
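The two lemma statements might look roughly as follows. All names here (expr, cmd, Skip, Seq, If, While, cmd_equiv) are illustrative placeholders standing in for the paper's omitted definitions, so we only state the goals, with the semantics left abstract.

```coq
(* Placeholder definitions; the paper's actual cmd, evaluation relation
   and command equivalence are not reproduced here. *)
Parameter expr : Type.
Inductive cmd : Type :=
| Skip : cmd
| Seq : cmd -> cmd -> cmd
| If : expr -> cmd -> cmd -> cmd
| While : expr -> cmd -> cmd.
Parameter cmd_equiv : cmd -> cmd -> Prop.

(* Hypothetical statements of the two lemmas proven by sauto:
   associativity of sequencing and one-step unfolding of while. *)
Conjecture seq_assoc : forall c1 c2 c3 : cmd,
  cmd_equiv (Seq (Seq c1 c2) c3) (Seq c1 (Seq c2 c3)).
Conjecture while_unfold : forall (b : expr) (c : cmd),
  cmd_equiv (While b c) (If b (Seq c (While b c)) Skip).
```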

Note again that both proofs require multiple inversions, and thus it is not possible to obtain them with standard Coq automation tactics.

According to Kunze [25], the following set-theoretical statement cannot be proven in reasonable time by either firstorder or JProver. The sauto tactic finds a proof in less than 0.3s. Below Seteq, Subset and In are variables of type \(U \rightarrow U \rightarrow \mathtt {Prop}\); Sum of type \(U \rightarrow U \rightarrow U\); and \(U : \mathtt {Type}\).
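A plausible Coq formulation of such a statement, using the variables named above, is sketched below. This is our reconstruction; the exact statement considered by Kunze [25] may differ.

```coq
(* Assumes the CoqHammer plugin is installed; this provides sauto. *)
From Hammer Require Import Tactics.

Section SetSketch.
  (* A plausible reconstruction of the set-theoretical statement. *)
  Variable U : Type.
  Variables Seteq Subset In : U -> U -> Prop.
  Variable Sum : U -> U -> U.

  Hypothesis seteq_def :
    forall A B, Seteq A B <-> Subset A B /\ Subset B A.
  Hypothesis subset_def :
    forall A B, Subset A B <-> (forall x, In x A -> In x B).
  Hypothesis sum_def :
    forall A B x, In x (Sum A B) <-> In x A \/ In x B.

  (* Commutativity of Sum up to set equality. *)
  Lemma sum_comm : forall A B, Seteq (Sum A B) (Sum B A).
  Proof. sauto. Qed.
End SetSketch.
```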

7 Conclusions and Future Work

We presented a practical general proof search procedure for Coq based on type inhabitation. This increases the power of out-of-the-box automation available for Coq and provides an effective reconstruction backend for CoqHammer. The empirical evaluations indicate that our approach to fully automated proof search in the Calculus of Inductive Constructions is practically viable.

For efficiency reasons, the inhabitation procedure is not complete in general, but it is complete for a first-order fragment of the Calculus of Inductive Constructions. We conjecture that a variant of our procedure could be shown complete for the Calculus of Inductive Constructions with definitional proof irrelevance, with only non-dependent elimination, and without fixpoints.

We implemented the proof search procedure in OCaml and Ltac as a Coq plugin. The plugin generates values in Coq's tactic monad which contain callbacks to the plugin. Better efficiency would probably be achieved by directly generating Coq proof terms or sequences of basic tactics. This would, however, require considerable engineering work. Another disadvantage of the monadic implementation is that it limits the proof search strategy to depth-first order and precludes global caching. In the artificial intelligence literature, there are many approaches to graph and tree search [20] which might turn out to be better suited for an inhabitation procedure than depth-first tree search.