Extensible and Efficient Automation Through Reflective Tactics
 10 Citations
 759 Downloads
Abstract
Foundational proof assistants simultaneously offer both expressive logics and strong guarantees. The price they pay for this flexibility is often the need to build and check explicit proof objects which can be expensive. In this work we develop a collection of techniques for building reflective automation, where proofs are witnessed by verified decision procedures rather than verbose proof objects. Our techniques center around a verified domain specific language for proving, \(\mathcal {R}_{tac}\), written in Gallina, Coq’s logic. The design of tactics makes it easy to combine them into higherlevel automation that can be proved sound in a mostly automated way. Furthermore, unlike traditional uses of reflection, \(\mathcal {R}_{tac}\) tactics are independent of the underlying problem domain, which allows them to be retasked to automate new problems with very little effort. We demonstrate the usability of \(\mathcal {R}_{tac}\) through several case studies demonstrating orders of magnitude speedups for relatively little engineering work.
1 Introduction
Foundational proof assistants provide strong guarantees about properties expressed in rich logics. They have been applied to reason about operating systems [29], compilers [32], databases [35], and cyberphysical systems [18], just to name a few. In all of these contexts, users leverage the expressivity of the underlying logic to state their problem and use the automation provided by the proof assistant, often in the form of a “tactic language,” to prove the properties they care about, e.g. the correctness of a compiler.
The problem with this rosy picture is that foundational proofs are large because they need to spell out the justification in complete detail. This is especially true in rich dependent type theories such as Coq [17] and Agda [1] where proofs are firstclass objects with interesting structural and computational properties. Several recent projects have made the overhead of proof objects painfully clear: the Verified Software Toolchain [2], CertiKOS [41], and Bedrock [15]. All of these projects run into resource limitations, often in memory, but also in build times. Proof generating automation tends to work well when applied to small problems, but scales poorly as problems grow. To solve this problem we need a way to extend the proof checker in a trustworthy way so that it can check certain properties more efficiently. Doing this allows us to dramatically improve both the time and memory usage of foundational verification.
The expressivity of foundational constructive proof assistants provides a technique for “extending the proof checker”: computational reflection [5, 6, 9, 30, 40]. Using computational reflection, a developer implements a custom program to check properties outright and proves that the program is sound, i.e. if it claims that a property is provable, then a proof of the property exists. With computational reflection, previously large proof objects can be replaced by calls to these custom decision procedures. Executing these procedures can be much more efficient than type checking explicit proof terms. However, the current approach to building these reflective procedures makes them difficult to construct, cumbersome to adapt to new instances, and almost impossible to compose to build higherlevel automation.
In this work we show how to easily build reflective automation that is both extensible and efficient. Our approach is to separate the core reflective language from the domainspecific symbols that the automation reasons about. This enables us to build a Domain Specific Language (DSL), \(\mathcal {R}_{tac}\), for reflective automation that is reusable across any base theory. We show that naïvely translating automation from \(\mathcal {L}_{tac}\), Coq’s builtin tactic language, to \(\mathcal {R}_{tac}\) typically leads to at least an order of magnitude speedup on reasonable problem sizes.

We develop MirrorCore, a reusable Coq library for general and extensible computational reflection built around a core \(\lambda \)calculus (Sect. 4).

We build \(\mathcal {R}_{tac}\)—the first foundational, featurerich reflective tactic language (Sect. 5). By separating domainspecific meaning from generic manipulation, \(\mathcal {R}_{tac}\) is able to provide highlevel building blocks that are independent of the underlying domain. Additionally the userexposed interface makes it easy to combine these general tactics into larger, domainspecific automation and derive the corresponding soundness proofs essentially for free.

We demonstrate the extensible nature of \(\mathcal {R}_{tac}\) by developing a reflective setoid rewriter as an \(\mathcal {R}_{tac}\) tactic (Sect. 6). The custom implementation follows the MirrorCore recipe of separating the core syntax from the domainspecific symbols allowing the rewriter to be retasked to a range of problem domains. In addition, the procedure is higherorder and can invoke \(\mathcal {R}_{tac}\) tactics to discharge side conditions during rewriting.

We evaluate MirrorCore and \(\mathcal {R}_{tac}\) by developing reflective procedures for different domains (Sect. 7). We show that our automation has substantially better scaling properties than traditional \(\mathcal {L}_{tac}\) automation has and is quite simple to write in most cases.
Before concluding, we survey a range of related work related to automation alternatives in proof assistants (Sect. 8).
All of our results have been mechanized in the Coq proof assistant. The MirrorCore library and examples are available online:
2 A Computational Reflection Primer
3 \(\mathcal {R}_{tac}\) from the Client’s Perspective
Before delving into the technical machinery that underlies our framework we highlight the end result. In Fig. 1, we excerpt an \(\mathcal {R}_{tac}\) implementation of the monoidal equivalence checker described in Sect. 2^{1}. The automation builds directly on the Coq definitions of commutative monoids excerpted in step (0).
The first step is to build a data type that can represent the properties that we care about. In Sect. 2 we built a custom data type and spelled out all of the cases explicitly. Here, we build the syntactic representation by instantiating MirrorCore ’s generic language ( Open image in new window ) with domain specific types ( Open image in new window ) and symbols ( Open image in new window ). As we will see in Sect. 4, the Open image in new window language provides a variety of features that are helpful when building automation. Once we have defined the syntax, we use MirrorCore ’s programmable reification plugin to automatically construct syntactic representations of the lemmas that we will use in our automation (step (2)).
In step (3), we use these lemmas to write our reflective automation using the \(\mathcal {R}_{tac}\) DSL. The entry point to the automation is the Open image in new window tactic but the core procedures are Open image in new window and Open image in new window which permute the left ( Open image in new window ) and righthand sides ( Open image in new window ) of the equality until matching terms can be cancelled. Each tactic consists of a bounded recursion ( Open image in new window ) where the body tries one of several tactics ( Open image in new window ). For example, Open image in new window tries to apply Open image in new window to remove a unit element from the righthandside. The double semicolon sequences two tactics together. It applies the second tactic to all ( Open image in new window ) goals produced by the first or applies a list of tactics one to each generated subgoal ( Open image in new window ).
In step (4) we prove the soundness of the automation. Soundness proofs are typically a major part of the work when developing reflective tactics; however, the compositional nature of \(\mathcal {R}_{tac}\) tactics makes proving soundness almost completely automatic. The Open image in new window (\(\mathcal {L}_{tac}\)) tactic proves the soundness of a tactic by composing the soundness of its individual pieces.
Finally, we use Open image in new window to verify equivalences in the monoid (step (5)). On small problems, the difference between \(\mathcal {L}_{tac}\) and our technique is negligible. However, for large problem sizes, our automation performs several orders of magnitude faster. We defer a more detailed evaluation to Sect. 7.
Building and Evolving Automation. While the goal of automation is a “pushbutton” solution, it rarely starts out that way. The automation shown in Fig. 1, like most \(\mathcal {R}_{tac}\) automation, was constructed incrementally in much the same way that users build automation using \(\mathcal {L}_{tac}\) [14]. The developer inspects the goal and finds the next thing to do to make progress. This same process works when developing automation in \(\mathcal {R}_{tac}\). When the developer runs a tactic that does not solve the goal, a new goal is returned showing what is left to prove. By default, the process of constructing the syntactic representation is hidden from the user and new goals are returned after they have been converted back into their semantic counterparts.
It is important to note, that while we can develop tactics incrementally, \(\mathcal {R}_{tac}\) is not built to do manual proofs in the style of \(\mathcal {L}_{tac}\). When run alone, the core \(\mathcal {R}_{tac}\) tactics (e.g. Open image in new window ) are often slower than their \(\mathcal {L}_{tac}\) counterparts. \(\mathcal {R}_{tac}\) ’s speed comes from the ability to replace large proof terms with smaller ones, and larger proofs only arise when combining multiple reasoning steps.
4 The MirrorCore Language
A key component of reflective automation is the syntactic representation of the problem domain. We need a representation that is both expressive and easy to extend. In this section we present MirrorCore ’s generic Open image in new window language which we used in Sect. 3.
Following the standard presentation of the \(\lambda \)calculus, the language is divided into two levels: types and terms. The type language is completely userdefined but has two requirements. First, it must have a representation of function types so that \(\lambda \)abstractions and applications can be typed. Second, it requires decidable equality to ensure that type checking Open image in new window terms is decidable. In order to use \(\mathcal {R}_{tac}\) (which we discuss in Sect. 5) the type language also requires a representation of Coq’s type of propositions ( Open image in new window ).
The term language follows a mostly standard presentation of the \(\lambda \)calculus using De Bruijn indices to represent bound variables. To support binders, the denotation function of terms \(\left( {^{t_u}_{t_v}\llbracket {}\rrbracket _{t}}\right) \) is parameterized by two type environments (for unification variables, \(t_u\), and regular variables, \(t_v\)) and the result type (t). These three pieces of information give us the type of the denotation. Concretely, the meaning of a term is a Coq function from the two environments (with types \({\llbracket {t_u}\rrbracket ^{\overrightarrow{\tau }}}\) and \({\llbracket {t_v}\rrbracket ^{\overrightarrow{\tau }}}\)) to the result type (\({\llbracket {t}\rrbracket ^\tau }\)). If the term is illtyped then it has no denotation, which we encode using Coq’s Open image in new window type. The denotation function returns Open image in new window with the denotation of the term if it is welltyped, or Open image in new window if the term is illtyped^{2}. The choice to use Coq’s function space for the denotation means that some of the theorems in MirrorCore rely on the axiom of functional extensionality.
5 \(\mathcal {R}_{tac}\): Verification Building Blocks
In this section we implement a fullyreflective proving language, called \(\mathcal {R}_{tac}\), modeled on \(\mathcal {L}_{tac}\), Coq’s builtin tactic language. \(\mathcal {R}_{tac}\) allows clients to build completely reflective automation easily without ever needing to write Coq functions that inspect terms. In fact, after the reification step, \(\mathcal {R}_{tac}\) almost completely encapsulates the fact that we are using computational reflection at all.
\(\mathcal {R}_{tac}\) packages unification variables, premises, and a conclusion, into a “goal” that programs (also called tactics) operate on. Combining these pieces into a simple interface allows \(\mathcal {R}_{tac}\) to cleanly export larger granularity operations that rely on multiple pieces. For example, a common operation is to apply a lemma to the conclusion and convert its premises into new goals [36]. Doing this requires inspecting the lemma, constructing new unification variables, performing unifications, constructing new goals, and justifying it all using the lemma’s proof.
Implementing \(\mathcal {R}_{tac}\) requires solving two intertwined problems. In Sect. 5.1, we describe how we represent and reason about proofs embedded in arbitrary contexts which contain regular variables, unification variables, and propositions. Our reasoning principles allow us to make inferences under these contexts, and evolve the contexts by instantiating unification variables in the context without needing to recheck proofs. In Sect. 5.2, we present our compositional phrasing of tactic soundness which allows us to easily compose sound tactics to produce sound automation. We close the section (Sect. 5.3) by discussing \(\mathcal {R}_{tac}\) ’s clientfacing interface including a subset of the tactics that we have implemented and an example using them to build a small piece of automation.
5.1 Contexts and Contextualized Proofs
The intuitive interpretation of contexts, denoting unification variables as existential quantifiers, captures that property that they can be choosen in the proof, but this interpretation is not sufficient for compositional proofs. To see why, consider a proof of \({\llbracket {c}\rrbracket _{}^{\mathcal C}\vdash {P}}\) and \({\llbracket {c}\rrbracket _{}^{\mathcal C}\vdash {Q}}\). We would like to combine these two proofs into a proof of \({\llbracket {c}\rrbracket _{}^{\mathcal C}\vdash {(P \wedge _c Q)}}\) but we can not because the two proofs may make contradictory choices for existentially quantified values. For example, if c is \(\exists x : \mathbb {N}\), P is \(x = 0\), and Q is \(x = 1\) both proofs exist independently by picking the appropriate value of x but the two do not compose. To solve this problem, we use the parametric interpretation of contexts defined in Fig. 3 where unification variables are interpreted as universal quantifiers with an equation if they are instantiated. This interpretation captures the parametricity necessary to compose proofs by ensuring that proofs that do not constrain the values of unification variables hold for any welltyped instantiation.
5.2 Implementing \(\mathcal {R}_{tac}\)
The contextual reasoning principles from the previous section form the heart of the proof theory of \(\mathcal {R}_{tac}\). In this section, we describe the generic language constructs and their soundness criteria.
In \(\mathcal {R}_{tac}\) we solve the scoping problem using the context. Unification variables introduced in the context are only allowed to mention variables and unification variables that are introduced below it. For example, in ‘\(c,\exists _\tau {=}e\)’, e is only allowed to mention variables and unification variables introduced by c.
This design choice comes with a deficiency when introducing multiple unification variables. For example, if we wish to introduce multiple unification variables, we need to pick an order of those unification variables and the choice is important because we can not instantiate an earlier unification variable using a later one. While there can never be cycles, the order that we pick is significant. Our solution is to introduce mutually recursive blocks of unification variables simultaneously. The reasoning principles for these blocks are quite similar to the reasoning principles that we presented in this section and the last, but there is a bit more bookkeeping involved.
5.3 The Core Tactics
6 Extending \(\mathcal {R}_{tac}\): Exploiting Structure in Rewriting
\(\mathcal {R}_{tac}\) allows users to develop goaloriented automation with a tight coupling between the code we run and the justification of its soundness. In many cases, this is just what the doctor ordered; however, because \(\mathcal {R}_{tac}\) is defined within Coq’s logic, we can implement custom automation that interoperates directly with \(\mathcal {R}_{tac}\). In this section we present a custom \(\mathcal {R}_{tac}\) tactic for setoid rewriting—we will explain what this means in a moment. In addition to being an \(\mathcal {R}_{tac}\) tactic, our setoid rewriter is parameterized by lemmas to rewrite with and tactics to discharge the premises of the lemmas.
Writing custom tactics is an advanced topic that is still quite difficult, though we are working on easing the process. The benefit of writing a custom tactic is that we can customize the implementation to exploit additional structure that is present in the problem domain.
Rewriting. Formally, rewriting answers the question: “what term is equal to e using a set of equational lemmas?” Setoid rewriting generalizes the use of equality in this question to arbitrary relations. For example, the Fiat system for deductive synthesis [20] is built around rewriting using refinement relations.

General relations, including both nonreflexive and nontransitive ones,

Rewriting in function arguments and under binders,

Hint databases which store the lemmas to use in rewriting, and

Discharging sideconditions using \(\mathcal {R}_{tac}\) tactics.
The reflective implementation of our rewriter allows us to avoid explicitly constructing proof terms which, as we show in Sect. 7, results in substantial performance improvements. In addition, exporting the rewriter as a tactic makes it easy to integrate into larger reflective automation.
We start the section off with an explanation of the mechanism underlying setoid rewriting using a simple example (Sect. 6.1). In Sect. 6.2 we show how we exploit this structure to make the rewriter more effective. We conclude by presenting the clientfacing interface to the rewriter (Sect. 6.3).
6.1 Setoid Rewriting by Example
When the rewriter recurses to solve the first obligation (\(P \dashv \; ?_1\)) it finds that there is no explicit proof about P and \(\dashv \). However, the reflexivity of the \(\dashv \) relation allows the rewriter to use P to instantiate \(?_1\), solving the goal. While this may seem obvious, this check is necessary to support rewriting by nonreflexive relations since, for arbitrary relations, there may be no term related to P.
The problem gets a bit more interesting when we rewrite the body at the pointwise relation. The definition of ‘\(\textsf {pointwise\_relation}\,\mathbb {N}\,\dashv \)’ makes it clear that we can rewrite in the function body as long as the two bodies are related by \(\dashv \) when applied to the same x, so we will shift a universally quantified natural number into our context and begin rewriting in the body.
Rewriting in the rest of the term is similar. The only complexity comes from determining the appropriate morphism for Q. First, if we do not find a morphism for Q, we can still rewrite \(Q\,(x+1)\) into \(Q\,(x+1)\) by the reflexivity of \(\dashv \) but this prevents us from rewriting the addition. The solution is to derive \(\textsf {Proper}\,(=\;\Longrightarrow \;\dashv )\,Q\) by combining the fact that all Coq functions respect equality, i.e. \(\textsf {Proper}\,(=\;\Longrightarrow \;=)\,Q\), and the reflexivity of \(\dashv \).
Rewriting on the Way Up. Eventually, our rewriter will hit the bottom of the term and begin coming back up. It is during this process that we make use of the actual rewriting lemmas. For example, take applying the commutativity of addition on \(x + 1\). Our rewriter just solved the recursive relations stating that \(x = x\) and \(1 = 1\) so we have a proof of \(x + 1 = x + 1\). However, because equality is transitive, we can perform more rewriting here. In particular, the commutativity of addition justifies rewriting \(x+1\) into \(1 + x\).
The new result logically fits into our system but the justification is a bit strained. The recursive rewriting above already picked a value for the unification variable that this subproblem was solving. Noting this issue, we realize that we should have appealed to transitivity before performing the recursive rewriting. Doing this requires a bit of foresight since blindly applying transitivity could yield in an unprovable goal if the relation is not also reflexive.
After lifting the existential quantifier to the top, our rewriting is complete. The key property to note from the example is that the only symbols that the rewriter needed to interpret where the morphisms, e.g. respectful and pointwise. All other reasoning was justified entirely by a combination of rewriting lemmas, properness lemmas, and the reflexivity and transitivity of relations. Thus, like \(\mathcal {R}_{tac}\), our rewriter is parametric in the domain.
6.2 Implementing the Rewriter
There are two opportunities to make a custom rewriter more efficient than one implemented using \(\mathcal {R}_{tac}\) primitives. First, rewriting tends to produce a lot of unification variables as properness rules have two unification variables for every function argument, only one of which will be solved when applying the theorem. Our small example above would introduce at least 8 unification variables where in reality none are strictly necessary. Second, rewriting needs to be clever about when it appeals to transitivity.
This encoding also allows us to perform additional processing after our recursive rewrites return. If we use unification variables to return results, we need to ensure that we do not instantiate that unification variable until we are certain that we have our final result. Therefore, when we make recursive calls, we would need to generate fresh unification variables and track them. Communicating the result directly solves this problem elegantly because the rewriter can inspect the results of recursive calls before it constructs its result. The key to justifying this manipulation is that, unlike \(\mathcal {R}_{tac}\), the soundness proof of the rewriter gets a global view of the computation before it needs to provide a proof term. This gives it the flexibility to apply, or not apply, transitivity based on the entire execution of the function. That is, if multiple rewrites succeed, and the relation is transitive, then the soundness proof uses transitivity to glue the results together. If only one succeeds, there is no need to use transitivity and the proof from the recursive call is used. And if none succeed, and the relation is not reflexive then rewriting fails. It is important to note that this global view is purely a verificationtime artifact. It incurs no runtime overhead.
Another benefit of accepting the term directly is that the rewriter can perform recursion on it directly. The core of the bottomup rewriter handles the cases for the five syntactic constructs of the Open image in new window language, of which only application and abstraction are interesting. In the application case the rewriter accumulate the list of arguments delaying all of its decisions until it reaches the head symbol. In order to ensure that the rewriter can perform recursive calls on these subterms, the rewriter pairs the terms with closures representing the recursive calls on these subterms. This technique is reminiscent of hereditary substitutions and makes it quite easy to satisfy Coq’s termination checker. Abstractions are the only construct that is treated specially within the core rewriter. For abstractions, we determine whether the relation is a pointwise relation and if so, we shift the variable into the context and make a recursive call. Otherwise, we treat the abstraction as an opaque symbol.
6.3 Instantiating the Rewriter
7 Case Studies
Using \(\mathcal {R}_{tac}\), we have built several pieces of automation that perform interesting reasoning and have completely automatic proofs. Beyond these small case studies, \(\mathcal {R}_{tac}\) has also been used for automating a larger program logic [22] and we are currently using it to verify Java programs with Charge! [7].
We performed our evaluation on a 2.7Ghz Core i7 running Linux and Coq 8.5rc1. Our benchmarks compare our reflective automation to similar \(\mathcal {L}_{tac}\) automation. Unless otherwise noted, both implementations use the same algorithms and the \(\mathcal {R}_{tac}\) implementations use only the generic tactics. We benchmark two phases: proof generation and proof checking (notated by Qed). In the \(\mathcal {L}_{tac}\) implementations, proof generation is the time it takes to interpret the tactics and construct the proof object, and the Qed time is the time it takes Coq to check the final proof. Note that when using \(\mathcal {L}_{tac}\), the proof checking does not include the search necessary to find the proof. In the \(\mathcal {R}_{tac}\) implementation, proof generation counts the cost to construct the syntactic representation of the goal and perform the computation on Coq’s bytecode virtual machine [27]. During Qed time the syntactic representation is typechecked, its denotation is computed and checked to be equal to the proposition that needs to be checked, and the tactic is reexecuted to ensure that the computation is correct.
Monoidal Equivalence. Our first case study is the monoidal equivalence checker from Sect. 3. The graph in Fig. 8 shows how the \(\mathcal {R}_{tac}\) implementation scales compared to the \(\mathcal {L}_{tac}\) implementation. Despite the fact that both the \(\mathcal {R}_{tac}\) and the \(\mathcal {L}_{tac}\) automation perform exactly the same search, the \(\mathcal {R}_{tac}\) implementation scales significantly better than the \(\mathcal {L}_{tac}\) implementation. The breakeven point–where \(\mathcal {R}_{tac}\) and \(\mathcal {L}_{tac}\) are equally fast—is at roughly 8 terms where the proof size begins to increase dramatically compared to the problem size.
Postcondition Verification. We developed a simple program verifier in \(\mathcal {R}_{tac}\) for a trivial imperative programming language, which includes assignments, addition, and sequencing. All of the automation is based heavily on a simple \(\mathcal {L}_{tac}\) implementation. For example, the lemmas that we reify are the same lemmas that are applied by the \(\mathcal {L}_{tac}\) implementation. We built the automation incrementally, first automating only sequencing and copies and later integrating handling of simple reasoning about addition.
The graph in Fig. 8 compares the performance between pure \(\mathcal {L}_{tac}\) (\(\mathcal {L}_{tac}\)), pure \(\mathcal {R}_{tac}\) (\(\mathcal {R}_{tac}\)), and a hybrid of \(\mathcal {L}_{tac}\) and \(\mathcal {R}_{tac}\) (\(\mathcal {L}_{tac}\) +\(\mathcal {R}_{tac}\)) implementation. The problem instances increment each of n variables with the postcondition stating that each variable has been incremented. The x axis shows the number of variables (which is directly related to both the size of the program and the pre and postconditions) and the y axis is the verification time in seconds. There are two subproblems in the verifier: first, the postcondition needs to be computed, and second, the entailment between the computed postcondition and the stated postcondition is checked. The blue (\(\mathcal {L}_{tac}\)) line automates both subproblems in \(\mathcal {L}_{tac}\). Converting the postcondition calculation tactic to \(\mathcal {R}_{tac}\) and leaving the entailment checking to \(\mathcal {L}_{tac}\) already gets us a substantial performance improvement for larger problems, e.g. a 31x reduction in verification time for 26 variables. The red dotted line (\(\mathcal {L}_{tac}\) post) shows the amount of that time that is spent checking the final entailment in \(\mathcal {L}_{tac}\). Converting the entailment checker into a reflective procedure results in another 2.6x speedup bringing the tactic to the green line at the bottom of the graph. Overall, the translation from pure \(\mathcal {L}_{tac}\) to pure \(\mathcal {R}_{tac}\) leads to an almost 84x reduction in the total verification time from 151 seconds to less than 2. In addition to the performance improvement on the subproblems, solving the entire verification with a single reflective tactic avoids the need to leave the syntactic representation which is often accounts for a large portion of the time in reflective automation [34].
For this simple language, the entire translation from \(\mathcal {L}_{tac}\) to \(\mathcal {R}_{tac}\) took a little over a day. We encountered three main stumbling blocks in our development. First, the metatheory for our language is built on the Charge! library [8] which relies heavily on typeclasses which are not automatically handled by MirrorCore and \(\mathcal {R}_{tac}\). Second, solving the entailment requires reasoning about arithmetic. In \(\mathcal {L}_{tac}\), we can use Coq’s builtin Open image in new window tactic which discharges obligations in Presburger arithmetic. Since we are building our automation reflectively, we lose access to generateandcheckstyle automation. We solve this problem by writing custom \(\mathcal {R}_{tac}\) to discharge the regular form of the goals that we need for this example, but more robust automation is clearly preferable and would likely benefit many clients. Finally, reasoning steps in \(\mathcal {L}_{tac}\) rely on Coq’s reduction mechanism. Since computational reflection treats symbols opaquely by default we needed to write reflective unfolders which replace symbols with their implementations. This is currently a quite manual task, though we believe that it could be simplified with some additional development.
For exposition purposes we have purposefully kept the automation simple. Adding support for additional language constructs is quite simple assuming they have good reasoning principles. In general, we reify the lemma and add new arms to the Open image in new window tactic to apply them. To extend the automation to a more realistic language we need to adapt the language and the automation to support heap operations and a rich separation logic. Our initial work on this suggests that this extension crucially relies on a good separation logic entailment checker. We believe that our monoidal equivalence checker is a first step in this direction, but more work is necessary to handle abstract representation predicates and more clever solving of unification variables that represent frame conditions.
To perform lifting, we instantiate the rewriter using the definitions in Fig. 9. The core traversal is Open image in new window which uses the bottomup traversal. When the head symbol is a conjunction the recursive rewrites pull quantifiers to the top, which produces a formula similar to: \((\exists x : \mathbb {N}, P x) \wedge (\exists y : \mathbb {N}, \exists z : \mathbb {N}, Q\,y\,z)\). To lift all of the quantifiers from both sides of the conjunct, Open image in new window repeatedly performs rewriting to lift the existentials over the conjunct. This rewriting also uses the bottomup rewriter, but only uses the properness lemmas that allow the rewriter to descend into the body of existential quantifiers ( Open image in new window ) to avoid descending into terms that we know do not contain quantifiers.
Figure 9 compares our reflective rewriter with two nonreflective strategies on a problem where 10 existential quantifiers are lifted from the leaves of a tree of conjuncts. The \(\mathcal {L}_{tac}\) rewriter uses Open image in new window since Coq’s builtin Open image in new window tactic does not rewrite under binders which is necessary to complete the problem. Open image in new window uses Sozeau’s new rewriter [42] which is more customizable and produces better proof terms. Even on small instances, the reflective rewriter is faster than both \(\mathcal {L}_{tac}\) and Open image in new window , e.g. at 16 conjuncts the reflective implementation is 13x faster than the strategy rewriter and 42x faster than the \(\mathcal {L}_{tac}\) implementation. The performance improvement is due to the large proof terms that setoid rewriting requires.
8 Related and Future Work
The idea of computational reflection has been around since Nuprl [5, 30] and also arose later in LEGO [40]. Besson [9] first demonstrated the ideas in Coq reasoning about Peano arithmetic. Since then there have been a variety of procedures that use computational reflection. Braibant’s AAC tactics [12] perform reflective reasoning on associative and commutative structures such as rings. Lescuyer [33] developed a simple, reflective, SMT solver in Coq. There has also been work on reflectively checking proof traces produced by external tools in Coq [3, 10]. The development of powerful reflective tactics has been facilitated by fast reduction mechanisms in Coq [11, 27]. Our work tackles the problem of making computational reflection more accessible to ordinary users of Coq by making it easier to write and prove reflective automation sound, which is the main bottleneck in using computational reflection in practice.
Our work builds on the ideas for computational reflection developed by Malecha et al [34]. We extend that work by supporting a richer term and type language that is capable of representing higherorder problems and problems with local quantification^{4} using a formalized version of Garillot’s representation of simple types in type theory [24]. Our work differs from Garillot’s by applying the representation to build reusable reflective automation. Some of the technical content in this paper is expanded in Malecha’s dissertation [37].
Work by Keller describes an embedding of HOL lite in Coq [28] in order to transfer HOL proofs to Coq. Our representation in MirrorCore is very close to their work since, like MirrorCore, HOL lite does not support dependent types. Their work even discusses some of the challenges in compressing the terms that they import. For us, computational reflection solves the problem of large proof terms, though we did have to tune our representation to shrink proof terms. Separately, Fallenstein and Kumar [23] have applied reflection in HOL focusing on building selfevolving systems using logics based on large cardinals.
Recent work [21, 44] has built reflective tactics in the Agda proof assistant. Unlike our work, this work axiomatizes the denotation function for syntax and relies on these axioms to prove the soundness of tactics. They argue that this axiomatization is reasonable by restricting it to only reduce on values in some cases which is sufficient for computational reflection. The reason for axiomatizing the denotation function is to avoid the overwhelming complexity of embedding a language as rich as dependent type theory within itself [4, 13, 19, 38, 39, 45].
Kokke and Swierstra [31] have developed an implementation of Open image in new window in Agda using Agda’s support for computational reflection. Their work also includes a reflective implementation of unification similar to our own. Their implementation abstracts over the search strategy allowing them to support heuristic search strategies, e.g. breadth and depthfirst search. Developing their procedure using only \(\mathcal {R}_{tac}\) would be difficult because \(\mathcal {R}_{tac}\) ’s design follows \(\mathcal {L}_{tac}\) ’s model and only supports depthfirst search. However, we can still implement a custom tactic similar to our rewriting tactic that interfaces with \(\mathcal {R}_{tac}\).
Beyond computational reflection, there has been a growing body of work on proof engineering both in Coq and other proof assistants. Ziliani developed \(\mathcal {M}_{tac}\) [46], a proofgenerating tactic language that manipulates proofs explicitly. While it does generate proofs and thus is not truly reflective, it does provide a cleaner way to develop proofs. Prior to \(\mathcal {M}_{tac}\), Gonthier [26] demonstrated how to use Coq’s canonical structures to approximate disciplined proof search. Canonical structures are the workhorse of automation in the Ssreflect tactic library [25] which uses them as “small scale reflection.” Our approach is based on largescale computational reflection, seeking to integrate reflective automation to build automation capable of combining many steps of reasoning.
Escaping some of the complexity of dependent type theories, Stampoulis’s VeriML [43] provides a way to write verified tactics within a simpler type theory. Like \(\mathcal {R}_{tac}\), VeriML tactics are verified once meaning that their results do not need to be checked each time. VeriML is an entirely new proof assistant with accompanying metatheory. Our implementation of \(\mathcal {R}_{tac}\) is built directly within Coq and is being used alongside Coq’s rich dependent type theory.
Future Work. The primary limitation in MirrorCore is the fact that the term representation supports only simple types. For the time being we have been able to get around this limitation by using metalevel parameterization to represent, and reason about, type constructors, polymorphism, and dependent types. Enriching MirrorCore to support these features in a firstclass way would make it easier to write generic automation that can reason about these features, for example applying polymorphic lemmas. The primary limitation to doing this comes from the need to have decidable type checking for the term language. With simple types, decidable equality is simply syntactic equality, but when the type language contains functions type checking requires reduction.
Currently, the main cost of switching to reflective tactics is the loss of powerful tactics such as Open image in new window and Open image in new window . While pieces of these tactics are reflective, integrating them into larger developments requires that they use extensible representations. Porting these tactics to work on MirrorCore would be a step towards making them usable within \(\mathcal {R}_{tac}\). Unfortunately, many of these plugins rely on interesting OCaml programs to do the proof search and then emit witnesses that are checked reflectively. Accommodating this kind of computation within \(\mathcal {R}_{tac}\) would require native support for invoking external procedures and reconstructing the results in Coq a la Claret’s work [16].
9 Conclusion
We built a framework for easily building efficient automation in Coq using computational reflection. Our framework consists of MirrorCore, an extensible embedding of the simplytyped \(\lambda \)calculus inside of Coq, which we use as our core syntactic representation. On top of MirrorCore, we built the reflective tactic language \(\mathcal {R}_{tac}\). \(\mathcal {R}_{tac}\) is modeled on \(\mathcal {L}_{tac}\) which makes it easy to port simple \(\mathcal {L}_{tac}\) tactics directly to \(\mathcal {R}_{tac}\). Our case studies show that even naïve \(\mathcal {R}_{tac}\) implementations can be nearly 2 orders of magnitude faster than their corresponding \(\mathcal {L}_{tac}\) tactics. To demonstrate the extensible nature of \(\mathcal {R}_{tac}\), we built a bottomup setoid rewriter capable of interoperating with \(\mathcal {R}_{tac}\) tactics while retaining the ability to use custom data types and a global view of computation for efficiency.
The combination of MirrorCore and \(\mathcal {R}_{tac}\) opens the doors to larger formalisms within dependent type theories that need to construct proof objects. \(\mathcal {R}_{tac}\) allows us to combine individual reflective tactics into higherlevel automation which allows us to further amortize the reification overhead across more reasoning steps. We believe that fully reflective automation will enable more and larger applications of the rich program logics currently being developed.
Footnotes
 1.
The full code can be found in the MirrorCore distribution.
 2.
We permit illtyped terms in our representation to avoid indexing terms by their type. Indexed terms are larger which increases the time it takes to typecheck them thus slowing down the automation.
 3.
In this definition we avoid the complexities of illtyped terms. In the code, the soundness proof gets to assume that the context and goal are both welltyped.
 4.
Malecha’s work supports local quantification only in separation logic formulae and relies crucially on the type of separation logic formulae to be inhabited.
References
 1.Agda Development Team. The Agda proof assistant reference manual, version 2.4.2. Accessed 2014 (2014)Google Scholar
 2.Appel, A.W.: Program logics for certified compilers. Cambridge University Press, Cambridge (2014)CrossRefzbMATHGoogle Scholar
 3.Armand, M., Faure, G., Grégoire, B., Keller, C., Théry, L., Werner, B.: A modular integration of SAT/SMT solvers to Coq through proof witnesses. In: Jouannaud, J.P., Shao, Z. (eds.) CPP 2011. LNCS, vol. 7086, pp. 135–150. Springer, Heidelberg (2011)CrossRefGoogle Scholar
 4.Barras, B., Werner, B.: Coq in Coq. Technical report, INRIARocquencourt (1997)Google Scholar
 5.Barzilay, E.: Implementing reflection in Nuprl. Ph.D. thesis, Cornell University, AAI3195788 (2005)Google Scholar
 6.Barzilay, E., Allen, S., Constable, R.: Practical reflection in NuPRL. In: Short Paper Presented at LICS 2003, pp. 22–25, June 2003Google Scholar
 7.Bengtson, J., Jensen, J.B., Sieczkowski, F., Birkedal, L.: Verifying objectoriented programs with higherorder separation logic in Coq. In: Eekelen, M., Geuvers, H., Schmaltz, J., Wiedijk, F. (eds.) ITP 2011. LNCS, vol. 6898, pp. 22–38. Springer, Heidelberg (2011)CrossRefGoogle Scholar
 8.Bengtson, J., Jensen, J.B., Birkedal, L.: Charge! a framework for higherorder separation logic in Coq. In: Beringer, L., Felty, A. (eds.) ITP 2012. LNCS, vol. 7406, pp. 315–331. Springer, Heidelberg (2012)CrossRefGoogle Scholar
 9.Besson, F.: Fast reflexive arithmetic tactics the linear case and beyond. In: Altenkirch, T., McBride, C. (eds.) TYPES 2006. LNCS, vol. 4502, pp. 48–62. Springer, Heidelberg (2007)CrossRefGoogle Scholar
 10.Besson, F., Cornilleau, P.E., Pichardie, D.: Modular SMT proofs for fast reflexive checking inside Coq. In: Jouannaud, J.P., Shao, Z. (eds.) CPP 2011. LNCS, vol. 7086, pp. 151–166. Springer, Heidelberg (2011)CrossRefGoogle Scholar
 11.Boespflug, M., Dénès, M., Grégoire, B.: Full reduction at full throttle. In: Jouannaud, J.P., Shao, Z. (eds.) CPP 2011. LNCS, vol. 7086, pp. 362–377. Springer, Heidelberg (2011)CrossRefGoogle Scholar
 12.Braibant, T., Pous, D.: Tactics for reasoning modulo AC in Coq. In: Jouannaud, J.P., Shao, Z. (eds.) CPP 2011. LNCS, vol. 7086, pp. 167–182. Springer, Heidelberg (2011)CrossRefGoogle Scholar
 13.Chapman, J.: Type theory should eat itself. Electron. Notes Theoret. Comput. Sci. 228, 21–36 (2009)CrossRefzbMATHGoogle Scholar
 14.Chlipala, A.: Certified Programming with Dependent Types. MIT Press, Cambridge (2008)zbMATHGoogle Scholar
 15.Chlipala, A.: From network interface to multithreaded web applications: a casestudy in modular program verification. In: Proceedings of POPL 2015, pp. 609–622. ACM (2015)Google Scholar
 16.Claret, G., del Carmen González Huesca, L., RégisGianas, Y., Ziliani, B.: Lightweight proof by reflection using a posteriori simulation of effectful computation. In: Blazy, S., PaulinMohring, C., Pichardie, D. (eds.) ITP 2013. LNCS, vol. 7998, pp. 67–83. Springer, Heidelberg (2013)CrossRefGoogle Scholar
 17.Coq Development Team. The Coq proof assistant reference manual, version 8.4. Accessed 2012 (2015)Google Scholar
 18.Ricketts, D., Malecha, G., Alvarez, M.M., Gowda, V., Lerner, S.: Towards verification of hybrid systems in a foundational proof assistant. In: MEMOCODE 2015 (2015)Google Scholar
 19.Danielsson, N.A.: A formalisation of a dependently typed language as an inductiverecursive family. In: Altenkirch, T., McBride, C. (eds.) TYPES 2006. LNCS, vol. 4502, pp. 93–109. Springer, Heidelberg (2007)CrossRefGoogle Scholar
 20.Delaware, B., PitClaudel, C., Gross, J., Chlipala, A.: Fiat: deductive synthesis of abstract data types in a proof assistant. In: Proceedings of POPL 2015, pp. 689–700. ACM (2015)Google Scholar
 21.Devriese, D., Piessens, F.: Typed syntactic metaprogramming. In: Proceedings of ICFP 2013, pp. 73–86. ACM (2013)Google Scholar
 22.Dodds, J., Cao, Q., Bengtson, J., Appel, A.W.: Computational symbolic execution in interactive Hoarelogic proofs (Under submission)Google Scholar
 23.Fallenstein, B., Kumar, R.: ProofProducing Reflection for HOL. In: Urban, C., Zhang, X. (eds.) ITP 2015. LNCS, vol. 9236, pp. 170–186. Springer International Publishing, Switzerland (2015)Google Scholar
 24.Garillot, F., Werner, B.: Simple types in type theory: deep and shallow encodings. In: Schneider, K., Brandt, J. (eds.) TPHOLs 2007. LNCS, vol. 4732, pp. 368–382. Springer, Heidelberg (2007)CrossRefGoogle Scholar
 25.Gonthier, G., Mahboubi, A., Tassi, E.: A small scale reflection extension for the Coq system. Rapport de recherche RR6455, INRIA (2008)Google Scholar
 26.Gonthier, G., Ziliani, B., Nanevski, A., Dreyer, D.: How to make Ad Hoc proof automation less Ad Hoc. In: Proceedings of ICFP 2011, pp. 163–175. ACM (2011)Google Scholar
 27.Grégoire, B., Leroy, X.: A compiled implementation of strong reduction. In: Proceedings of ICFP 2002, pp. 235–246. ACM (2002)Google Scholar
 28.Keller, C., Werner, B.: Importing HOL light into Coq. In: Kaufmann, M., Paulson, L.C. (eds.) ITP 2010. LNCS, vol. 6172, pp. 307–322. Springer, Heidelberg (2010)CrossRefGoogle Scholar
 29.Klein, G., Elphinstone, K., Heiser, G., Andronick, J., Cock, D., Derrin, P., Elkaduwe, D., Engelhardt, K., Kolanski, R., Norrish, M., Sewell, T., Tuch, H., Winwood, S.: seL4: formal verification of an OS kernel. In: Proceedings of SOSP, pp. 207–220. ACM (2009)Google Scholar
 30.Knoblock, T.B., Constable, R.L.: Formalized metareasoning in type theory. Technical report, Cornell University (1986)Google Scholar
 31.Kokke, P., Swierstra, W.: Auto in Agda. In: Hinze, R., Voigtländer, J. (eds.) MPC 2015. LNCS, vol. 9129, pp. 276–301. Springer, Heidelberg (2015)CrossRefGoogle Scholar
 32.Leroy, X.: Formal certification of a compiler backend, or: programming a compiler with a proof assistant. In: 33rd ACM symposium on Principles of Programming Languages, pp. 42–54. ACM Press (2006)Google Scholar
 33.Lescuyer, S.: Formalisation et développement d’une tactique réflexive pourla démonstration automatique en Coq. Thèse de doctorat, Université ParisSud, January 2011Google Scholar
 34.Malecha, G., Chlipala, A., Braibant, T.: Compositional computational reflection. In: Klein, G., Gamboa, R. (eds.) ITP 2014. LNCS, vol. 8558, pp. 374–389. Springer, Heidelberg (2014)Google Scholar
 35.Malecha, G., Morrisett, G., Shinnar, A., Wisnesky, R.: Toward a verified relational database management system. In: Proceedings of POPL 2010, pp. 237–248. ACM (2010)Google Scholar
 36.Malecha, G., Wisnesky, R.: Using dependent types and tactics to enable semantic optimization of languageintegrated queries. In: Proceedings of DBPL 2015, pp. 49–58. ACM (2015)Google Scholar
 37.Malecha, G.M.: Extensible proof engineering in intensional type theory. Ph.D. thesis, Harvard University, November 2014Google Scholar
 38.McBride, C.: Outrageous but meaningful coincidences: dependent typesafe syntax and evaluation. In: Proceedings of WGP 2010, pp. 1–12. ACM (2010)Google Scholar
 39.Mike, S.: Homotopy type theory should eat itself (but so far, it’s too big to swallow), March 2014Google Scholar
 40.Pollack, R.: On extensibility of proof checkers. In: Smith, J., Dybjer, P., Nordström, B. (eds.) TYPES 1994. LNCS, vol. 996, pp. 140–161. Springer, Heidelberg (1995)CrossRefGoogle Scholar
 41.Shao, Z.: Cleanslate development of certified OS kernels. In: Proceedings of CPP 2015, pp. 95–96. ACM (2015)Google Scholar
 42.Sozeau, M.: Proofrelevant rewriting strategies in Coq. In: At Coq Workshop, July 2014Google Scholar
 43.Stampoulis, A., Shao, Z.: VeriML: typed computation of logical terms inside a language with effects. In: Proceedings of ICFP, pp. 333–344. ACM (2010)Google Scholar
 44.van der Walt, P., Swierstra, W.: Engineering proof by reflection in Agda. In: Hinze, R. (ed.) IFL 2012. LNCS, vol. 8241, pp. 157–173. Springer, Heidelberg (2013)CrossRefGoogle Scholar
 45.Werner, B.: Sets in types, types in sets. In: Ito, T., Abadi, M. (eds.) TACS 1997. LNCS, vol. 1281, pp. 530–546. Springer, Heidelberg (1997)CrossRefGoogle Scholar
 46.Ziliani, B., Dreyer, D., Krishnaswami, N.R., Nanevski, A., Vafeiadis, V.: Mtac: a monad for typed tactic programming in Coq. In: Proceedings of ICFP 2013, pp. 87–100. ACM (2013)Google Scholar