On the Generation of Quantified Lemmas
Abstract
In this paper we present an algorithmic method of lemma introduction. Given a proof in predicate logic with equality, the algorithm is capable of introducing several universal lemmas. The method is based on an inversion of Gentzen’s cut-elimination method for sequent calculus. The first step consists of the computation of a compact representation (a so-called decomposition) of Herbrand instances in a cut-free proof. Given a decomposition, the problem of computing the corresponding lemmas is reduced to the solution of a second-order unification problem (the solution conditions). It is shown that there is always a solution of the solution conditions, the canonical solution. This solution yields a sequence of lemmas and, finally, a proof based on these lemmas. Various techniques are developed to simplify the canonical solution, resulting in a reduction of proof complexity. Moreover, the paper contains a comprehensive empirical evaluation of the implemented method and gives an application to a mathematical proof.
Keywords
Cut-introduction · Herbrand’s theorem · Proof theory · Lemma generation · The resolution calculus
1 Introduction
Computer-generated proofs are typically analytic, i.e. they only contain logical material that also appears in the theorem proved. This is due to the fact that analytic proof systems have a much smaller search space, which makes proof search practically feasible. In the case of sequent calculi, proof-search procedures typically work on the cut-free fragment. Resolution is also essentially analytic as resolution proofs do not contain complex lemmas. An important property of non-analytic proofs is their considerably smaller length. The exact difference depends on the logic (or theory) under consideration, but it is typically enormous. In (classical and intuitionistic) first-order logic there are proofs with cut of length n whose theorems have only cut-free proofs of length \(2_n\) (where \(2_0 = 1\) and \(2_{n+1}=2^{2_n}\)), see [24, 27, 32]. In contrast, proofs formalized by humans are almost never analytic. Human insight and understanding of a mathematical situation is manifested in the use of concepts, as well as properties of, and relations among, concepts in the form of lemmas. This leads to a high-level structure of a proof. For these two reasons, their length and the insight they (can) contain, we consider the generation of non-analytic proofs an aim of high importance to automated deduction.
There is another, more theoretical, motivation for studying cut-introduction which derives from the foundations of mathematics: most of the central mathematical notions have developed from the observation that many proofs share common structures and steps of reasoning. Encapsulating those leads to a new abstract notion, like that of a group or a vector space. Such a notion then builds the base for a whole new theory whose importance stems from the pervasiveness of its basic notions in mathematics. From a logical point of view this corresponds to the introduction of cuts into an existing proof database. While the introduction of these notions can certainly be justified from a pragmatic point of view since it leads to natural and concise presentations of mathematical theories, the question remains whether they can be justified on more fundamental grounds as well. In particular, the question remains whether the notions at hand provide for an optimal compression of the proofs under consideration. A cut-introduction method based on such quantitative aspects (as the one described in this paper) has the potential to answer such questions, see Sect. 6.1 for a case study.
Work on cut-introduction can be found at a number of different places in the literature. Closest to our work are other approaches which aim to abbreviate or structure a given input proof: [35] is an algorithm for the introduction of atomic cuts that is capable of exponential proof compression. The method [15] for propositional logic is shown to never increase the size of proofs more than polynomially. Another approach to the compression of first-order proofs by introduction of definitions for abbreviating terms is [34].
Viewed from a broader perspective, this paper should be considered part of a large body of work on the generation of non-analytic formulas that has been carried out by numerous researchers in various communities. Methods for lemma generation are of crucial importance in inductive theorem proving, which frequently requires generalization [7], see e.g. [20] for a method in the context of rippling [8] that is based on failed proof attempts. In automated theory formation [10, 11], an eager approach to lemma generation is adopted. This work has, for example, led to automated classification results of isomorphism classes [30] and isotopy classes [31] in finite algebra. See also [21] for an approach to inductive theory formation. In pure proof theory, an important related topic is Kreisel’s conjecture on the generalization of proofs, see [9]. Based on methods developed in this tradition, [5] describes an approach to cut-introduction by filling a proof skeleton, i.e. an abstract proof structure, obtained by an inversion of Gentzen’s procedure with formulas in order to obtain a proof with cuts. The use of cuts for structuring and abbreviating proofs is also of relevance in logic programming: [23] shows how to use focusing in order to avoid proving atomic subgoals twice, resulting in a proof with atomic cuts.
Our previous work in this direction started with [19], where we presented a basic algorithm for the introduction of a single cut with a single universal quantifier in pure first-order logic. In [17] we made the method practically applicable by extending it to compute a \(\varPi _1\)-cut with an arbitrary number of quantifiers and by working modulo equality. In [17] we also presented and evaluated an implementation. The method was further extended on a proof-theoretic level to the introduction of an arbitrary number of \(\varPi _1\)-cuts with one quantifier each in [18], which already allows for an exponential compression.
In this paper we extend the method to predicate logic with equality and to the introduction of an arbitrary number of \(\varPi _1\)-cuts, each of which has an arbitrary number of quantifiers. We present an implementation based on a new (and efficient) algorithm for computing a decomposition of a Herbrand disjunction [12]. We carry out a comprehensive empirical evaluation of the implementation and describe a case study demonstrating how our algorithm generates the notion of a partial order from a proof about a lattice. This paper thus completes the theory and implementation of our method for the introduction of \(\varPi _1\)-cuts.
The paper is organized in the same order as the steps of our algorithm. In Sect. 2, we recall basic notions and results about proofs, as well as the extraction of Herbrand sequents and how to encode them as term sets. Sect. 3 is devoted to the computation of decompositions of those term sets. Then in Sect. 4, we describe how to compute canonical cut formulas induced by the decomposition. We present several techniques to simplify those canonical cut formulas in Sect. 5. At the end, we describe our implementation and experiments in Sect. 6.
2 Proofs and Herbrand Sequents
Throughout this paper we consider predicate logic with equality. We typically use the names a, b, c for constants, f, g, h for functions, \(x,y,z, \alpha \) for variables, \(\varGamma \) and \(\varDelta \) for sets of formulas, and \(\mathcal {S}\) for sequents. We write sequents in the form \(\mathcal {S}:A_1,\ldots ,A_n \rightarrow B_1,\ldots ,B_m\) where \(\mathcal {S}\) is interpreted as the formula \((A_1 \wedge \cdots \wedge A_n) \supset (B_1 \vee \cdots \vee B_m)\). For convenience we write a substitution \([x_1 \backslash t_1, \ldots , x_n \backslash t_n]\) in the form \( [ \overline{x} \backslash \overline{t} ] \) for \(\overline{x} = (x_1,\ldots ,x_n)\) and \(\overline{t} = (t_1,\ldots ,t_n)\).
A strong quantifier is a \(\forall \) (\(\exists \)) quantifier with positive (negative) polarity. The logical complexity \(|\mathcal {S}|_l\) of a sequent \(\mathcal {S}\) is the number of propositional connectives, quantifiers and atoms \(\mathcal {S}\) contains. We restrict our investigations to end-sequents in prenex form without strong quantifiers.
Definition 1
Note that the restriction to \(\varSigma _1\)-sequents does not constitute a substantial restriction as one can transform every sequent into a validity-equivalent \(\varSigma _1\)-sequent by Skolemization and prenexing.
Definition 2
A sequent \(\mathcal {S}\) is called E-valid if it is valid in predicate logic with equality; \(\mathcal {S}\) is called a quasi-tautology [29] if \(\mathcal {S}\) is quantifier-free and E-valid.
We use \(\models \) for the semantic consequence relation in predicate logic with equality.
Definition 3
The length of a proof \(\varphi \), denoted by \(|\varphi |\), is defined as the number of inferences in \(\varphi \). The quantifier complexity of \(\varphi \), written as \(|\varphi |_q\), is the number of weak quantifier inferences in \(\varphi \).
2.1 Extraction of Terms
Herbrand sequents of a sequent \(\mathcal {S}\) are sequents consisting of instantiations of \(\mathcal {S}\) which are quasi-tautologies. The formal definition is:
Definition 4
Note that, in the instantiation complexity of a Herbrand sequent, we count the formulas weighted by the number of their quantifiers. Formulas in \(\mathcal {S}\) without quantifiers are represented by empty tuples in the Herbrand structure (e.g. \(H_i = \{ () \}\)), and do not affect the instantiation complexity as they are weighted by 0.
Example 5
Theorem 6
(Midsequent theorem) Let \(\mathcal {S}\) be a \(\varSigma _1\)-sequent and \(\pi \) a cut-free proof of \(\mathcal {S}\). Then there is a Herbrand sequent \(\mathcal {S}^*\) of \(\mathcal {S}\) s.t. \(|\mathcal {S}^*|_i \le |\pi |_q |\mathcal {S}|_l\).
Proof
This result is proven in [16] (section IV, theorem 2.1) for \(\mathbf {LK}\), but the proof for \(\mathbf {LK}_=\) is basically the same. By permuting the inference rules, one obtains a proof \(\pi '\) from \(\pi \) which has an upper part containing only propositional inferences and the equality rules (which can be shifted upwards until they are applied to atoms only) and a lower part containing only quantifier inferences. The sequent between these parts is called midsequent and has the desired properties. \(\square \)
\(\mathcal {S}^*\) can be obtained by tracing the introduction of quantifiers in the proof, which for every formula \(Q \bar{x}_i. F_i\) in the sequent (where \(Q\in \{\forall ,\exists \}\)) yields a set of term tuples \(H_i\), and then computing the sets of formulas \({\mathcal {F}}_i\).
The algorithm for introducing cuts described here relies on computing a compressed representation of a Herbrand structure, which is explained in Sect. 3. Note, though, that the Herbrand structure \((H_1, \ldots , H_q)\) is a list of sets of term tuples (i.e. each \(H_i\) is a set of tuples \(\overline{t}\) used to instantiate the formula \(F_i\)). In order to facilitate computation and representation, we will add to the language fresh function symbols \(f_1, \ldots , f_q\). Each \(f_i\) will be applied to the tuples of the set \(H_i\), therefore encoding a list of sets of tuples into a set of terms. In this new set, each term will have some \(f_i\) as its head symbol that indicates to which formula the arguments of \(f_i\) belong.
Definition 7
Let \(\mathcal S\) be a \(\varSigma _1\)-sequent as in Definition 1 and let \(f_1, \dots , f_q\) be fresh function symbols. We then say that the term \(f_i(\overline{t})\) encodes the instance \(F_i[\overline{x} \backslash \overline{t}]\). Terms of the form \(f_i(\overline{t})\) for some i are called decodable.
We refer to the encoded Herbrand structure as the term set of a proof. Conversely, such a term set defines a Herbrand structure and thus a Herbrand sequent.
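The encoding of a Herbrand structure as a term set can be sketched in a few lines of Python. This is only an illustration under assumptions that are not part of the paper: terms are nested tuples `(symbol, arg1, ..., argn)`, the fresh symbols are literally named `f1, ..., fq`, and the function names `encode` and `decode` are hypothetical.

```python
# Illustrative sketch only; the term representation and names are assumptions.
# A term is a nested tuple (symbol, arg1, ..., argn); a constant c is ("c",).

def encode(herbrand_structure):
    """Turn a list (H_1, ..., H_q) of sets of term tuples into one set of terms.

    Each tuple in H_i becomes a term with the fresh head symbol f_i."""
    terms = set()
    for i, tuples in enumerate(herbrand_structure, start=1):
        for t in tuples:
            terms.add((f"f{i}",) + tuple(t))
    return terms


def decode(term_set, q):
    """Recover the list (H_1, ..., H_q) from a set of decodable terms."""
    structure = [set() for _ in range(q)]
    for head, *args in term_set:
        i = int(head[1:])          # "f3" -> 3; relies on the naming assumption
        structure[i - 1].add(tuple(args))
    return structure


# Example: F_1 has one quantifier and two instances, F_2 is quantifier-free
# and is therefore represented by the empty tuple.
a = ("a",)
H = [{(a,), (("s", a),)}, {()}]
term_set = encode(H)
```

Since the head symbols `f_i` are fresh, decoding the encoded structure returns the original structure, which is the property the later sections rely on.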
Example 8
3 Computing Decompositions
Computing a compact representation of the Herbrand structure of a cut-free proof is the first step in our lemma introduction algorithm. This is accomplished by computing decompositions of a proof’s term set.
Definition 9

\(\overline{\alpha _i}\) is a vector of variables of size \(n_i\);

all of the variables \(\overline{\alpha _i}_j\) are pairwise different;

U is a finite set of terms which can contain all variables from \(\overline{\alpha _1}, \ldots , \overline{\alpha _k}\);

\(S_i\) is a finite set of term vectors of size \(n_i\);

the terms in \(S_i\)’s vectors may only contain the variables from \(\overline{\alpha _{i+1}}, ..., \overline{\alpha _k}\) (consequently, \(S_k\) contains only ground terms).
Note that in the definition of covering above, we only require that L(D) is a superset of T and not that it is equal to T. This requirement is motivated by a property of Herbrand sequents: every supersequent of a Herbrand sequent is a Herbrand sequent as well. This relaxed requirement allows us to consider more decompositions for a given term set, and hence obtain a stronger compression. We aim to find a decomposition of minimal size that covers a given term set T.
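The covering condition can be made concrete with a small Python sketch. It assumes the usual reading of \(L(D)\) for \(D = U \circ _{\overline{\alpha _1}} S_1 \cdots \circ _{\overline{\alpha _k}} S_k\), namely the set of all terms obtained from some \(u \in U\) by substituting, in order, a vector from \(S_1\) for \(\overline{\alpha _1}\), then a vector from \(S_2\) for \(\overline{\alpha _2}\), and so on. The representation (variables as strings, other terms as nested tuples) and the names `substitute`, `language` and `covers` are illustrative choices, not taken from the paper or its implementation.

```python
# Illustrative sketch only; representation and names are assumptions.
# Variables are strings; any other term is a tuple (symbol, arg1, ..., argn).

def substitute(term, sigma):
    """Apply the substitution sigma (a dict variable -> term) to a term."""
    if isinstance(term, str):
        return sigma.get(term, term)
    head, *args = term
    return (head, *(substitute(arg, sigma) for arg in args))


def language(U, layers):
    """Compute L(D) for D = U o S_1 o ... o S_k.

    `layers` is a list of pairs (alphas_i, S_i): alphas_i is a tuple of
    variable names and S_i a set of equally long tuples of terms."""
    lang = set(U)
    for alphas, S in layers:
        lang = {substitute(u, dict(zip(alphas, s))) for u in lang for s in S}
    return lang


def covers(U, layers, T):
    """D covers T iff L(D) is a superset of T (equality is not required)."""
    return set(T) <= language(U, layers)


# A decomposition with k = 2: U = {f(α1)}, S_1 = {g(α2), α2}, S_2 = {c, d}.
U = {("f", "α1")}
layers = [
    (("α1",), {(("g", "α2"),), ("α2",)}),
    (("α2",), {(("c",),), (("d",),)}),
]
```

In this example the language consists of f(g(c)), f(g(d)), f(c) and f(d), so the decomposition covers any subset of these four terms.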
The notion of decomposition in Definition 9 is stated purely on the level of formal languages, without any references to proofs or formulas or Herbrand sequents. The algorithms we present in this section will likewise not be concerned with proofs, and compute decompositions based purely on the set of terms they get as input. Unfortunately, not all decompositions can be decoded into quantifier instances of a proof with cut. However, a very slight restriction on the decomposition suffices to ensure that this is nevertheless possible:
Definition 10
Let \(\mathcal {S}\) be a \(\varSigma _1\)-sequent. Then a decomposition \(D = U \circ _{\overline{\alpha _1}} S_1 \circ _{\overline{\alpha _2}} \dots \circ _{\overline{\alpha _k}} S_k \) is called decodable (for \(\mathcal {S}\)) iff every term \(u \in U\) is decodable.
Recall that a term u is decodable iff it is of the form \(f_i(\overline{t})\) for some i. Regarding only the size of the decomposition, this is no restriction at all. We can always transform a non-decodable decomposition into a decodable one, without increasing its size. The crucial property here is that the function symbols \(f_i\) only appear as the root symbols of terms in T, and not inside the terms.
Lemma 11
Let T be a finite set of ground terms, and D a decomposition of T. Then there exists a decomposition \(D'\) of T such that \(|D'| \le |D|\) and \(n_i = 1\) for all i.
Proof
We replace every \({} \circ _{\overline{\alpha _i}} S_i\) in the decomposition with \({} \circ _{\alpha _{i,1}} \pi _1(S_i) \cdots \circ _{\alpha _{i,n_i}} \pi _{n_i}(S_i)\), where \(\pi _m\) is the \(m\)-th projection. That is, instead of substituting all the variables in \(\overline{\alpha _i}\) at once, we substitute them one by one. The size of the decomposition does not increase since we multiplied the number of elements in \(S_i\) by \(n_i\). \(\square \)
Lemma 12
Let \(\pi \) be a cut-free proof, T its term set, and D a decomposition of T. Then there exists a decodable decomposition \(D'\) of T such that \(|D'| \le |D|\).
Proof
Without loss of generality assume \(n_i = 1\) for all i, and let \(U \circ _{\alpha _1} S_1 \circ _{\alpha _2} \dots \circ _{\alpha _k} S_k = D\). Define \(U' = \{ u \in U \cup S_1 \cup \cdots \cup S_k \mid \exists i\, u = f_i(\dots ) \}\), \(S'_i = S_i \setminus U'\) for \(1 \le i \le k\), and set \(D' = U' \circ _{\alpha _1} S'_1 \circ _{\alpha _2} \dots \circ _{\alpha _k} S'_k\), leaving out any \(S'_i\) where \(S'_i = \emptyset \). The size of the decomposition does not increase with this transformation. We need to show that \(L(D') \supseteq T\), so let \(t = u[\alpha _1 \backslash s_1, \dots , \alpha _k \backslash s_k] \in T\) where \(u \in U\), and \(s_i \in S_i\) for all i. If u is of the form \(f_i(\dots )\), then any \(s_i\) such that \(s_i \in U'\) is irrelevant since it does not contribute to t. If \(S_i \setminus U' = \emptyset \), then we leave them out, otherwise we replace them by an arbitrary other \(s_i' \in S'_i\). On the other hand, if \(u = \alpha _j\) for some j, then we change u to \(u' = s_j \in U'\) and leave out or replace the remaining \(s_i\) as before. \(\square \)
We can also formulate the problem of finding a minimal decomposition as a decision problem: given a finite set of ground terms T and \(m \ge 0\), is there a decomposition D of T such that \(|D| \le m\)? This problem is in NP: given a decomposition D, and for every term \(t \in T\) the necessary substitutions, we can check in polynomial time whether the language covers T. We conjecture that the problem is NP-hard as well.
Definition 13
An algorithm to produce decompositions takes as input a finite set of ground terms T, and returns a decomposition \(D = U \circ _{\overline{\alpha _1}} S_1 \circ _{\overline{\alpha _2}} \dots \circ _{\overline{\alpha _k}} S_k\) of T. Such an algorithm is called complete iff it always returns a decomposition of minimal possible size.
We will now present an incomplete but practically feasible solution to find small decompositions for a term set. Our algorithm is based on an operation called \(\varDelta \)-vector. Intuitively, it computes “greedy” decompositions \(U \circ S\) with only one element in the set U. We will call those simple decompositions. They are stored in a data structure called \(\varDelta \)-table, which is later processed for combining simple decompositions into more complex ones.
A previous version of this algorithm was presented in [17]. Since then, we have identified its source of incompleteness and implemented the so-called row-merging heuristic for finding more decompositions. Additionally, many bugs in the implementation were fixed.
3.1 The \(\varDelta \)-vector
Definition 14
Let T be a finite, nonempty set of terms, u a term and S a set of substitutions. Then (u, S) is a simple decomposition of T iff \(uS = \{ u\sigma \mid \sigma \in S \} = T\). Additionally, (u, S) is called trivial iff u is a variable.
Example 15
Let \(T = \{ f(c,c), f(d,d) \}\). Then \((f(\alpha ,\alpha ), \{ [\alpha \backslash c], [\alpha \backslash d] \})\) is a simple decomposition of T. Another decomposition of T is \((\alpha , \{ [\alpha \backslash f(c,c)], [\alpha \backslash f(d,d)] \})\), which is simple and trivial.
Given a nonempty subset \(T' \subseteq T\), the \(\varDelta \)-vector for \(T'\) produces a simple decomposition of \(T'\); we write \(\varDelta (T') = (u, S)\). This term u is computed via least general generalization, a concept introduced independently in [25, 26] and [28]. The least general generalization of two terms is computed recursively:
Definition 16
Example 17
Let f, a, and b be constants, and g a binary function symbol, then \(\mathrm {lgg}(f,g(a,b)) = \alpha _1\), \(\mathrm {lgg}(g(a,b),g(b,a)) = g(\alpha _1, \alpha _2)\), \(\mathrm {lgg}(g(a,b),g(a,a)) = g(a, \alpha _1)\), and \(\mathrm {lgg}(g(a,a),g(b,b)) = g(\alpha _1, \alpha _1)\).
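The recursive computation of the binary \(\mathrm {lgg}\) can be sketched as follows. This is the standard anti-unification procedure for ground terms, written under illustrative assumptions: terms are nested tuples `(symbol, arg1, ..., argn)`, variables are the strings `α1, α2, ...`, and the function name `lgg` mirrors the notation of the text rather than any actual implementation. Each distinct pair of mismatching subterms receives its own variable, numbered in the order in which the pairs are first met from left to right.

```python
# Illustrative sketch of anti-unification for ground terms; the term
# representation is an assumption, not the paper's implementation.

def lgg(s, t, table=None):
    """Least general generalization of two ground terms s and t."""
    if table is None:
        table = {}                 # pair of mismatching subterms -> variable
    if s == t:
        return s
    if s[0] == t[0] and len(s) == len(t):
        # Same head symbol and arity: generalize the arguments pointwise.
        return (s[0], *(lgg(a, b, table) for a, b in zip(s[1:], t[1:])))
    # Mismatch: reuse the variable if this pair of subterms was seen before.
    if (s, t) not in table:
        table[(s, t)] = f"α{len(table) + 1}"
    return table[(s, t)]


# The terms of Example 17.
f, a, b = ("f",), ("a",), ("b",)
g_ab, g_ba, g_aa, g_bb = ("g", a, b), ("g", b, a), ("g", a, a), ("g", b, b)
```

Reusing the variable for a repeated pair is what makes the generalization of g(a,a) and g(b,b) come out as g(α1, α1) rather than g(α1, α2).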
To have a canonical result term, we use the names \(\alpha _1, \dots , \alpha _n\) for the variables in \(\mathrm {lgg}(t,s)\), read left-to-right. The \(\mathrm {lgg}\) subsumes each of the arguments: given terms t and s, there always exist substitutions \(\sigma \) and \(\tau \) such that \(\mathrm {lgg}(t,s) \sigma = t\) and \(\mathrm {lgg}(t,s) \tau = s\). The \(\mathrm {lgg}\) operation is associative and commutative as well, and we can naturally extend it to finite nonempty sets of terms:
Definition 18
Example 19
Let a and b be constants, f a unary and g a binary function symbol, then \(\mathrm {lgg}\{f(a)\} = f(a)\), \(\mathrm {lgg}\{f(a), f(b)\} = f(\alpha _1)\), and \(\mathrm {lgg}\{f(a),f(b),g\} = \alpha _1\). Additionally let \(l = \mathrm {lgg}\{g(a,a), g(b,b), g(f(b),f(b))\} = g(\alpha _1, \alpha _1)\), then l subsumes each of the three arguments: \(l [\alpha _1 \backslash a] = g(a,a)\), \(l [\alpha _1 \backslash b] = g(b,b)\), \(l [\alpha _1 \backslash f(b)] = g(f(b),f(b))\).
Just as in the binary case, the \(\mathrm {lgg}\) always subsumes its arguments: for each \(t \in T'\), there exists a substitution \(\sigma _t\) such that \(\mathrm {lgg}(T') \sigma _t = t\). We can now define the \(\varDelta \)-vectors in terms of the \(\mathrm {lgg}\):
Definition 20
Let \(T'\) be a finite, nonempty set of ground terms. Then we define its \(\varDelta \)-vector as \(\varDelta (T') = (\mathrm {lgg}(T'), \{ \sigma _t \mid t \in T' \})\), where for every \(t \in T'\) the substitution \(\sigma _t\) satisfies \(\mathrm {lgg}(T') \sigma _t = t\).
Example 21
Let \(T' = \{ f(c,c), f(d,d) \}\), then \(\varDelta (T') = (f(\alpha _1, \alpha _1), \{ [\alpha _1\backslash c], [\alpha _1\backslash d] \})\).
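A \(\varDelta \)-vector can be computed in one pass by generalizing all terms simultaneously and recording, for each introduced variable, the subterms it replaces. The sketch below is a hedged illustration, not the implementation evaluated in the paper: it assumes ground input terms given as nested tuples, uses the hypothetical names `delta_vector` and `apply`, and computes the \(\mathrm {lgg}\) of the whole list directly instead of folding the binary operation, which for ground terms should yield the same result up to variable renaming.

```python
# Illustrative sketch only; representation and names are assumptions.
# Ground terms are tuples (symbol, arg1, ..., argn); variables are strings.

def delta_vector(terms):
    """Return (u, substitutions) with apply(u, substitutions[j]) == terms[j]."""
    terms = list(terms)
    table = {}  # column of differing subterms (one per input term) -> variable

    def generalize(column):
        first = column[0]
        if all(t == first for t in column):
            return first
        if all(t[0] == first[0] and len(t) == len(first) for t in column):
            return (first[0], *(generalize(tuple(t[i] for t in column))
                                for i in range(1, len(first))))
        return table.setdefault(column, f"α{len(table) + 1}")

    u = generalize(tuple(terms))
    # The j-th substitution maps every variable to the j-th entry of its column.
    substitutions = [{var: column[j] for column, var in table.items()}
                     for j in range(len(terms))]
    return u, substitutions


def apply(term, sigma):
    """Apply a substitution (dict variable -> term) to a term."""
    if isinstance(term, str):
        return sigma.get(term, term)
    return (term[0], *(apply(arg, sigma) for arg in term[1:]))


def is_trivial(u):
    """A simple decomposition is trivial iff its term u is a variable."""
    return isinstance(u, str)


c, d = ("c",), ("d",)
u, S = delta_vector([("f", c, c), ("f", d, d)])
```

For the term set of Example 21 this yields the term f(α1, α1) together with the substitutions [α1\c] and [α1\d].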
Algebraically, we can consider the set of terms where we identify terms up to variable renaming. This set is partially ordered by subsumption and the \(\mathrm {lgg}\) computes the meet operation, making it a meet-semilattice. In this semilattice, terms have a least upper bound iff they are unifiable; the join operation is given by most general unification.
The subset of terms with at most one variable is such a semilattice as well: every pair of terms has a greatest lower bound. From this point of view, we can also define a function \(\mathrm {lgg}_1\) as the meet operation in the sub-semilattice of terms with at most one variable. We may then define a variant \(\varDelta _1(T) = (u,S)\) of the \(\varDelta \)-vector, where u may contain only a single variable. We will compare both variants of the \(\varDelta \)-vector in the large-scale experiments in Sect. 6.2.
3.2 The \(\varDelta \)-table
The \(\varDelta \)-table is a data structure that stores all nontrivial \(\varDelta \)-vectors of subsets of T, indexed by their sets of substitutions. Some of these simple decompositions are later combined into a decomposition of T.
Definition 22
A \(\varDelta \)-row is a pair \(S \rightarrow U\) where S is a set of substitutions, and U is a set of pairs (u, T) such that \(uS = T\). A \(\varDelta \)-table is a map where every key-value pair is a \(\varDelta \)-row.
Algorithm 1 computes a \(\varDelta \)-table containing the \(\varDelta \)-vectors for all subsets of T. As an optimization, we do not iterate over all subsets. Instead we incrementally add terms to the subset, stopping as soon as the \(\varDelta \)-vector is trivial. This optimization is justified by the following result:
Theorem 23
Let T be a set of terms. If \(\varDelta (T)\) is trivial, then so is \(\varDelta (T')\) for every \(T' \supseteq T\).
Proof
Let \(\varDelta (T) = (u,S)\) and \(\varDelta (T') = (u',S')\). By the subsumption property of the \(\mathrm {lgg}\), there is a substitution \(\sigma \) such that \(u'\sigma = u\). So if u is a variable, then \(u'\) is necessarily a variable as well. \(\square \)
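The incremental enumeration with pruning can be sketched as follows. This is a reconstruction from the description above and not Algorithm 1 itself: the names `delta_vector` and `delta_table`, the tuple representation of terms, and the encoding of a substitution as a sorted tuple of (variable, term) pairs are all assumptions made for illustration. Subsets are grown by adding terms in a fixed order, and a branch is abandoned as soon as the generalization becomes a variable, since by Theorem 23 every superset is then trivial too.

```python
# Illustrative reconstruction; not the paper's Algorithm 1.
# Ground terms are tuples (symbol, arg1, ..., argn); variables are strings.

def delta_vector(terms):
    """Return (u, S): the lgg of the terms and a hashable set of substitutions."""
    table = {}  # column of differing subterms -> variable

    def generalize(column):
        first = column[0]
        if all(t == first for t in column):
            return first
        if all(t[0] == first[0] and len(t) == len(first) for t in column):
            return (first[0], *(generalize(tuple(t[i] for t in column))
                                for i in range(1, len(first))))
        return table.setdefault(column, f"α{len(table) + 1}")

    u = generalize(tuple(terms))
    S = frozenset(
        tuple(sorted((var, column[j]) for column, var in table.items()))
        for j in range(len(terms))
    )
    return u, S


def delta_table(T):
    """Map each substitution set S to the list of pairs (u, T') with uS = T'."""
    T = sorted(T)
    table = {}

    def extend(subset, start):
        for i in range(start, len(T)):
            candidate = subset + [T[i]]
            u, S = delta_vector(candidate)
            if isinstance(u, str):
                continue  # trivial: no superset of `candidate` can be nontrivial
            table.setdefault(S, []).append((u, frozenset(candidate)))
            extend(candidate, i + 1)

    extend([], 0)
    return table
```

The pruning loses no nontrivial entry because every subset of a nontrivial set is itself nontrivial, so each such set is reached through nontrivial prefixes. In the worst case the enumeration is still exponential in the size of T.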
After having computed the \(\varDelta \)-table, we need to combine the simple decompositions to find a suitable one, i.e., one generating the full set T. Since we did not add trivial decompositions, each row of the \(\varDelta \)-table is completed with the pairs \((t, \{t\})\) for every \(t \in T\) as a post-processing step.
Let \(S \rightarrow [(u_1, T_1), ..., (u_r, T_r)]\) be one entry of T’s \(\varDelta \)-table. We know that \(T_i \subseteq T\) and that \(\{u_i\} \circ S\) is a decomposition of \(T_i\) for each \(i \in \{1 ... r\}\). Take \(\{T_{i_1}, ..., T_{i_s}\} \subseteq \{T_1, ..., T_r\}\) such that \(T_{i_1} \cup ... \cup T_{i_s} = T\). Then, since combining each \(u_{i_j}\) with S yields \(T_{i_j}\), and the union of these terms is T, the decomposition \(\{ u_{i_1}, ..., u_{i_s} \} \circ S\) will generate all terms from T. Observe that the vector of variables \(\overline{\alpha }\) used will be the same for all combined decompositions, since they share the same set S.
There might be several subsets of \(\{T_1, ..., T_r\}\) that cover T, so different decompositions can be found. For our purposes, only the minimal ones are considered. In the end, the \(\varDelta \)-table algorithm produces a decomposition D of T. If T was the term set of a proof, then D is even decodable:
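The combination step amounts to a small covering problem per row. The following brute-force sketch illustrates it under stated assumptions: the table is a dictionary from a substitution set S to a list of pairs (u, T'), the name `best_decomposition` is hypothetical, and \(|U| + |S|\) is used as a stand-in for the size of a decomposition (the measure used in the paper may weight S differently).

```python
# Illustrative sketch only; the size measure |U| + |S| is an assumption.
from itertools import combinations


def best_decomposition(table, T):
    """Pick, over all rows, a smallest cover {u_1, ..., u_s} o S of T."""
    T = frozenset(T)
    best = None
    for S, entries in table.items():
        # Post-processing step: complete the row with the pairs (t, {t}).
        row = list(entries) + [(t, frozenset([t])) for t in T]
        for r in range(1, len(row) + 1):
            cover = next(
                (combo for combo in combinations(row, r)
                 if frozenset().union(*(Ti for _, Ti in combo)) >= T),
                None,
            )
            if cover is None:
                continue
            U = {u for u, _ in cover}
            if best is None or len(U) + len(S) < len(best[0]) + len(best[1]):
                best = (U, S)
            break  # within a row, the first cover found uses the fewest entries
    return best


# Hand-built table for T = {f(c,c), f(d,d), g(c), g(d)}.
c, d = ("c",), ("d",)
T = {("f", c, c), ("f", d, d), ("g", c), ("g", d)}
S_id = frozenset({()})
S_cd = frozenset({(("α1", c),), (("α1", d),)})
table = {
    S_id: [(t, frozenset([t])) for t in T],
    S_cd: [
        (("f", "α1", "α1"), frozenset({("f", c, c), ("f", d, d)})),
        (("g", "α1"), frozenset({("g", c), ("g", d)})),
    ],
}
U, S = best_decomposition(table, T)
```

Here the row indexed by the two substitutions [α1\c] and [α1\d] wins, giving U = {f(α1, α1), g(α1)}. Exhaustive search over sub-collections is exponential, so a practical implementation would need a smarter covering strategy.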
Lemma 24
Let \(\pi \) be a cut-free proof, T its term set, and \(D = U \circ S\) the decomposition produced by the \(\varDelta \)-table algorithm. Then D is decodable.
Proof
The \(\varDelta \)-table only contains nontrivial simple decompositions \((u,S')\) where u is the \(\mathrm {lgg}\) of a subset of T. Such a u is necessarily of the form \(f_i(\dots )\), and hence all \(u \in U\) are as well. \(\square \)
Decompositions with \(k > 1\). The algorithm shown (and implemented in GAPT, see Sect. 6) computes only decompositions of the shape \(U \circ _{\overline{\alpha }} S\), i.e., with \(k=1\) (see Definition 9). In order to generate more general decompositions, we would have to run it again on the set U, treating all variables in \(\overline{\alpha }\) as constants.
The experiments with the simpler algorithm have given satisfying results so far, even when compared to another approach which finds more general decompositions (see Sect. 3.4). We have thus decided to postpone the analysis and implementation of an iterated \(\varDelta \)-table method.
3.3 Incompleteness and Row-Merging
Definition 25
Lemma 26
(row-merging) Let \(S_1 \rightarrow R_1\) and \(S_2 \rightarrow R_2\) be \(\varDelta \)-rows, and \(S_1 \preceq S_2\) with the substitution \(\sigma \) witnessing this subsumption. Then \(S_2 \rightarrow (R_2 \cup R_1\sigma )\) is a \(\varDelta \)-row as well.
Proof
Let \((u,T') \in R_1\). We need to show that \(u\sigma S_2 = T'\). But this follows from \(u S_1 = T'\) since \(S_1 \preceq S_2\) via \(\sigma \). \(\square \)
3.4 The MaxSAT Algorithm
In [12], the authors propose an algorithm for the compression of a finite set of terms by reducing the problem (in polynomial time) to MaxSAT. This is another method for finding a decomposition. The difference to the \(\varDelta \)-table algorithm is that one must provide the numbers k and \(\overline{\alpha _1}, ..., \overline{\alpha _k}\) in advance.
Using the reduction to MaxSAT to find decompositions is, in principle, a complete algorithm, meaning that it finds all decompositions in the shape specified by the parameters. But this requires finding all possible solutions for the generated MaxSAT problems. In addition, due to the number of variables in the generated problem, it is hardly feasible to find decompositions for \(k > 2\).
Given the limitations of both algorithms, their practical performance in terms of compressing proofs is comparable. Having both implementations is justified since the methods find different decompositions and therefore generate different cut formulas.
4 Computing Cut Formulas
After having computed a decomposition \(U \circ S_1 \circ ... \circ S_n\) as described in Sect. 3, the next step is computing cut formulas based on that decomposition. A decomposition D specifies the instances of quantifier blocks in a proof with \(\forall \)-cuts (both for end-sequent and cut formulas), but does not contain information about the propositional structure of the cut formulas to be constructed.
The set U in the decomposition corresponds to the instances of formulas in the end-sequent; the sequent \(\mathcal {S}_U\) in the following Definition 27 consists precisely of these instances. The sequents \(\mathcal {S}_U^i\) will simplify the definition of the proof with cut. Their definition is motivated by the eigenvariable condition: the instances in \(\mathcal {S}_U^i\) are precisely those which may occur at a point where the eigenvariables \(\overline{\alpha _1},\dots ,\overline{\alpha _i}\) have been introduced below.
Definition 27
Let \(\mathcal {S}\) be a \(\varSigma _1\)-sequent and \(F_i, k_i\) as in Definition 1, and \(D = U \circ _{\overline{\alpha _1}} S_1 \circ _{\overline{\alpha _2}} \cdots \circ _{\overline{\alpha _n}} S_n\) a decodable decomposition. We define the sequent \(\mathcal {S}_U = {\mathcal {F}}_{U,1}, \ldots , {\mathcal {F}}_{U,p} \rightarrow {\mathcal {F}}_{U,p+1}, \ldots , {\mathcal {F}}_{U,q}\), where \( {\mathcal {F}}_{U,i} = \{ F_i [ \overline{x_i} \backslash \overline{t} ] \mid f_i(\overline{t}) \in U \} \).
In addition, we define for every \(0 \le j \le n\) the sequent \(\mathcal {S}_U^j\) as follows: \(\mathcal {S}_U^j\) consists of all formulas \(F \in \mathcal {S}_U\) such that the free variables of F are included in \(\overline{\alpha _j},\ldots ,\overline{\alpha _n}\).
Example 28
Given a decomposition, it may be impossible to incorporate some formulas as cut formulas in a proof with the quantifier inferences indicated by the decomposition. For example, in most cases we will not be able to use \(\forall \alpha _1. \bot \) as a cut formula. Definition 30 states the precise conditions under which given formulas are usable as cut formulas. These conditions are also precisely the necessary conditions that will later on allow us to build a proof with these formulas as cuts.
Definition 29
Let \(\mathcal {S}= \varGamma \rightarrow \varDelta \) and \(\mathcal {T} = \varSigma \rightarrow \varPi \) be sequents. Then the sequent \(\mathcal {S}\circ \mathcal {T} = \varGamma , \varSigma \rightarrow \varDelta , \varPi \) is called the composition of \(\mathcal {S}\) and \(\mathcal {T}\).
Definition 30
Example 31
We can now proceed to give a definition of the proof with cut induced by a decomposition and a solution.
Definition 32
The construction in Definition 32 is clearly a proof in LK ending in \(\mathcal {S}\). The quantifier complexity \(|\pi _{D,F}|_q\) is bounded by \(|\mathcal {S}|_l |U| + \sum ^n_{i=1}a_i |S_i|\), where \(a_i\) is the length of the vector \(\overline{\alpha _i}\).
Example 33
The question remains whether every decomposition has a solution; we show below that this is indeed the case if the sequent defined by the term set of the decomposition is quasi-tautological. The main ingredient in this proof is the definition of a canonical substitution, which will turn out to be a solution in Theorem 35: the canonical substitution consists of formulas \(C_i\), such that \(C_i\) captures the maximum amount of logical information from the axioms that is available above the \(i\)-th cut.
Definition 34
We will now show that the canonical substitution is, in fact, a solution.
Theorem 35
Let \(\mathcal {S}\) be a valid \(\varSigma _1\)-sequent and D be a decodable decomposition for some Herbrand sequent \(\mathcal {S}^*\) of \(\mathcal {S}\). Then the canonical substitution \(\sigma \) is a solution.
Proof
First note that the variable condition is fulfilled as the free variables of \(C_i\) are included in \(\{ \overline{\alpha _i},\ldots , \overline{\alpha _n} \}\). We now need to check that each of the sequents \(I_i\sigma \) is quasi-tautological. Consider first \(I_0\sigma = \mathcal {S}_U^1 \circ (\rightarrow C_1, \ldots , C_n)\). Since \(\mathcal {S}_U^1 = \mathcal {S}_U\) and \(C_1 = \lnot \mathcal {S}_U\), we only need to observe that \(\mathcal {S}_U \circ (\rightarrow \lnot \mathcal {S}_U)\) is quasi-tautological.
For \(0 < i \le n\) and \(I_i\sigma = \mathcal {S}_U^i \circ (\{C_i \bar{w}_j^i, 1 \le j \le k_i\} \rightarrow C_{i+1}, \ldots , C_n)\), we see that \(\{C_i \bar{w}_j^i, 1 \le j \le k_i\}\) is equivalent to \(C_{i+1}\), and only need to show that \(\mathcal {S}_U^i \circ (C_{i+1} \rightarrow C_{i+1}, \ldots , C_n)\) is quasi-tautological, which is clear in the case \(i < n\). For \(i = n\) it suffices to show that \(C_{n+1} \rightarrow \) is a quasi-tautology: this is true since the sequent defined by the term set of D is quasi-tautological, and \(C_{n+1} \rightarrow \) is logically equivalent to the Herbrand sequent represented by the term set. \(\square \)
Example 36
5 Improving the Solution
After completing the first phase of cut-introduction, namely the computation of a decomposition, the next step is to find a solution to the schematic extended Herbrand sequent induced by the decomposition. Such a solution is guaranteed to exist by Theorem 35, and its construction is described in Definition 34. But is this solution optimal? The canonical solution as defined in Sect. 4 is relatively large, in general even exponential in the size of the decomposition. As a first step towards a smaller solution, we consider a slightly less elegant version of the canonical solution with lower logical complexity:
Definition 37
The “regular” canonical solution introduces all instances immediately in \(C_1\). By contrast, the modified canonical solution introduces instances as late as possible. Purely propositional instances are never included.
Theorem 38
Let \(\mathcal {S}\) be a \(\varSigma _1\)-sequent and D be a decodable decomposition for some Herbrand sequent \(\mathcal {S}^*\) of \(\mathcal {S}\). Then the modified canonical substitution \(\sigma '\) is a solution.
Proof
Similar to the proof of Theorem 35. \(\square \)
If we approach the question of optimality from the point of view of the \(|\cdot |_\mathrm {q}\) measure, then all solutions can be considered equivalent. From the point of view of symbolic complexity or logical complexity, things may be different: there are cases where the canonical solution is large, but small solutions exist. The following example exhibits such a case. In this example, a smaller solution not only exists, but is also more natural than (and hence in many applications preferable to) the canonical solution.
Example 39
Since the solution for the schematic extended Herbrand sequent is interpreted as the lemmata that give rise to the proof with cuts, and these lemmata will be read and interpreted by humans in applications, it is important to consider the problem of improving the logical and symbolic complexity of the canonical solution. Furthermore, a decrease in the logical complexity of a lemma often yields a decrease in the length of the proof that is constructed from it.
In the following sections, we describe a method which computes small solutions for schematic Herbrand sequents induced by decompositions. The method is incomplete (in the sense that a solution of minimal complexity may be missed) but efficient. It is based on resolution and paramodulation.
5.1 On the Solutions for a Single \(\varPi _1\)-Cut
The first basic observation is that solvability is a semantic property. The following is an immediate consequence of Definition 30.
Lemma 40
Let A be a solution, B a formula and \(\models A \Leftrightarrow B\). Then B is a solution.
Hence we may restrict our attention to solutions which are in conjunctive normal form (CNF). Formulas in CNF can be represented as sets of clauses, which in turn are sets of literals, i.e. possibly negated atoms. It is this representation that we will use throughout this section, along with the following properties: for sets of clauses A, B, \(A\subseteq B\) implies \(B\models A\), and for clauses C, D, \(C\subseteq D\) implies \(C\models D\).
Note that the converse of the lemma above does not hold: given a solution A there may be solutions B such that \({\nvDash }A\Leftrightarrow B\). We now turn to the problem of finding such solutions. In Example 39, we observe that \(C\models A\) (but \(A \not \models C\)). We can generalize this observation to show that the canonical solution is most general.
Lemma 41
Let C be the canonical solution and A an arbitrary solution. Then \(C\models A\).
Proof
Since \(\vartheta = [ X \backslash \lambda {\bar{\alpha }}.A ] \) is a solution for \({\mathcal {I}}\), the sequent \(F [ \overline{x} \backslash \overline{u_1} ] ,\ldots ,F [ \overline{x} \backslash \overline{u_m} ] ,A \supset \bigwedge _{j=1}^{k} A [ {\bar{\alpha }} \backslash \overline{s_{j}} ] \rightarrow \) is E-valid. By definition, \(C = \bigwedge _{i=1}^{m} F [ \bar{x} \backslash \overline{u_i} ] \), and therefore \(C, A\supset \bigwedge _{j=1}^k A [ {\bar{\alpha }} \backslash \overline{s_j} ] \rightarrow \) is E-valid, hence \(C\rightarrow A\) is E-valid. \(\square \)
This result states that any search for simple solutions can be restricted to consequences of the canonical solution. Unfortunately, due to equality in our language, there are infinitely many consequences. Even enumerating all consequences bounded by a fixed bound on symbol size would be computationally infeasible. Towards a more efficient iterative method, we give a criterion that allows us to disregard some of those consequences.
Lemma 42
(1) If \(\varGamma ',A [ {\bar{\alpha }} \backslash \overline{s_1} ] ,\ldots ,A [ {\bar{\alpha }} \backslash \overline{s_k} ] \rightarrow \) is not E-valid, then B is not a solution.
(2) If A is a solution then \(\varGamma \rightarrow B\) is E-valid.
(3) If A is a solution, then \(\varGamma ',B [ {\bar{\alpha }} \backslash \overline{s_1} ] ,\ldots ,B [ {\bar{\alpha }} \backslash \overline{s_k} ] \rightarrow \) is E-valid iff \( [ X \backslash \lambda {\bar{\alpha }}. B ] \) is a solution of \({\mathcal {I}}\).
Proof
For (1), we will show the contrapositive. By assumption, we have that \(\varGamma ',B [ {\bar{\alpha }} \backslash \overline{s_1} ] ,\ldots ,B [ {\bar{\alpha }} \backslash \overline{s_k} ] \rightarrow \) is E-valid. Since \(A \models B\), we find that furthermore \(\varGamma ',A [ {\bar{\alpha }} \backslash \overline{s_1} ] ,\ldots ,A [ {\bar{\alpha }} \backslash \overline{s_k} ] \rightarrow \) is E-valid. For (2) it suffices to observe that since A is a solution \(\varGamma \rightarrow A\) is E-valid, and to conclude by \(A\models B\). (3) is then immediate by definition. \(\square \)
Lemma 43
(Sandwich Lemma) Let A, B be solutions and \(A\models D \models B\). Then D is a solution.
5.2 Simplification by forgetful inference
In this section we define a method to simplify solutions which is based on resolution and paramodulation. The idea behind it is to generate solutions of smaller size by forgetful inference, i.e. if we derive F from \(F_1,F_2\) we replace \(F_1,F_2\) by F. This principle of inference is sound but obviously incomplete. The method is also incomplete in the sense that it might fail to produce the shortest solution; however it proved very useful in practice and is part of our implementation. From now on we assume that the formulas are in clause form, i.e. they are represented as finite sets of clauses (and clauses are considered as finite sets of literals). We may also assume that the clauses are ground (in particular we consider variables from \({\bar{\alpha }}\) as constants). Therefore the principles of resolution and paramodulation used below do not require unification.
Definition 44

\({\mathcal {C}}\rhd _r{\mathcal {C}}'\) if \({\mathcal {C}}' = ( {\mathcal {C}}\setminus \{C_1,C_2\} ) \cup \{R\}\), where \(C_1,C_2 \in {\mathcal {C}}\), \(C_1\ne C_2\) and R is a resolvent of \(C_1\) and \(C_2\) which is not a tautology.

\({\mathcal {C}}\rhd _p{\mathcal {C}}'\) if \({\mathcal {C}}' = ( {\mathcal {C}}\setminus \{C_1,C_2\} ) \cup \{R\}\), where \(C_1,C_2 \in {\mathcal {C}}\), \(C_1\ne C_2\) and R is a paramodulant of \(C_1\) and \(C_2\) which is not a tautology.

\({\mathcal {C}}\rhd _s{\mathcal {C}}'\) if either \({\mathcal {C}}\rhd _r{\mathcal {C}}'\) or \({\mathcal {C}}\rhd _p{\mathcal {C}}'\).
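The forgetful resolution step \(\rhd _r\) can be illustrated by a minimal Python sketch for ground clause sets. The representation (a literal as a pair of an atom name and a polarity, a clause as a `frozenset` of literals) and all function names are assumptions made for this illustration and are not taken from the GAPT implementation. Paramodulation (\(\rhd _p\)) is omitted.

```python
from itertools import combinations

# Assumed representation: a literal is (atom, positive); clauses and
# clause sets are frozensets. All names here are illustrative only.

def negate(lit):
    atom, positive = lit
    return (atom, not positive)

def is_tautology(clause):
    # A ground clause is a tautology if it contains a literal and its dual.
    return any(negate(lit) in clause for lit in clause)

def resolvents(c1, c2):
    """All ground resolvents of c1 and c2 (no unification needed)."""
    return {
        frozenset((c1 - {lit}) | (c2 - {negate(lit)}))
        for lit in c1
        if negate(lit) in c2
    }

def forgetful_resolution_steps(clauses):
    """Yield every clause set obtained by one forgetful resolution step:
    two distinct parent clauses are removed and replaced by one of their
    non-tautological resolvents."""
    clauses = frozenset(clauses)
    for c1, c2 in combinations(clauses, 2):
        for r in resolvents(c1, c2):
            if not is_tautology(r):
                yield (clauses - {c1, c2}) | {r}

# Example: {P}, {not P, Q}, {not Q, R}
c1 = frozenset({("P", True)})
c2 = frozenset({("P", False), ("Q", True)})
c3 = frozenset({("Q", False), ("R", True)})
for successor in forgetful_resolution_steps({c1, c2, c3}):
    print(sorted(sorted(c) for c in successor))
```

In this sketch every step removes two clauses and adds at most one, so the number of clauses strictly decreases with each step.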
Definition 45
Let \(({\mathcal {C}}_1,\ldots ,{\mathcal {C}}_n)\), \(({\mathcal {D}}_1,\ldots ,{\mathcal {D}}_n)\) be tuples of clause sets for \(n \ge 1\). We define \(({\mathcal {C}}_1,\ldots ,{\mathcal {C}}_n) \rhd _s({\mathcal {D}}_1,\ldots ,{\mathcal {D}}_n)\) if there exists an \(i \in \{1,\ldots ,n\}\) s.t. \({\mathcal {C}}_i \rhd _s{\mathcal {D}}_i\) and for all \(j \le n\) and \(j \ne i\) we have \({\mathcal {D}}_j = {\mathcal {C}}_j\).
Proposition 46
\(\rhd _s\) is sound, i.e. if \(({\mathcal {C}}_1,\ldots ,{\mathcal {C}}_n) \rhd _s({\mathcal {D}}_1,\ldots ,{\mathcal {D}}_n)\) then, for all \(i \in \{1,\ldots ,n\}\), \({\mathcal {C}}_i \models {\mathcal {D}}_i\).
Proof
By the soundness of resolution and paramodulation over equality interpretations we have that \({\mathcal {C}}\rhd _r{\mathcal {C}}'\) ( \({\mathcal {C}}\rhd _p{\mathcal {C}}'\)) implies \({\mathcal {C}}\models {\mathcal {C}}'\). \(\square \)
Proposition 47
\(\rhd _s\) is terminating.
Proof
Remark 48
Example 49
Below we define the set of simplified solution tuples for a set of solution conditions.
Definition 50
Definition 51

the canonical solution tuple of \({\mathcal {I}}\) is in \({ Sol}_s({\mathcal {I}})\),

if \(\varPsi \in { Sol}_s({\mathcal {I}})\), \(\varPsi \rhd _s\varPsi '\) and \(\varPsi '\) is a solution tuple of \({\mathcal {I}}\) then \(\varPsi ' \in { Sol}_s({\mathcal {I}})\).
Proposition 52
Let \({\mathcal {I}}\) be a set of solution conditions. Then \({ Sol}_s({\mathcal {I}})\) is a finite set of solution tuples of \({\mathcal {I}}\) and \({ Sol}_s({\mathcal {I}})\) is computable.
Proof
\({ Sol}_s({\mathcal {I}})\) is finite as, for the canonical solution tuple \(\varPsi _0\), there are only finitely many \(\varPsi \) s.t. \(\varPsi _0 \rhd _s^* \varPsi \) (note that, by Proposition 47, \(\rhd _s\) is terminating). It is computable because it is decidable whether a given tuple of clause sets \(\varPsi \) is a solution tuple of \({\mathcal {I}}\). \(\square \)
There are various ways to extract solution tuples from the set \({ Sol}_s({\mathcal {I}})\). We can either compute a minimal \(\varPsi \), i.e. a \(\varPsi \in { Sol}_s({\mathcal {I}})\) s.t. either all components of \(\varPsi \) are in normal form or \(\varPsi \rhd _s\varPsi '\) implies that \(\varPsi '\) is not a solution anymore. Or we can compute all minimal solution tuples \(\varPsi \in { Sol}_s({\mathcal {I}})\) and select those of minimal logical complexity.
Our implementation iteratively finds one minimal solution \(\varPsi \in { Sol}_s({\mathcal {I}})\): we start from the canonical solution \(\varPsi = (D_1, \dots , D_n) \in { Sol}_s({\mathcal {I}})\) and process the components of the tuple from right to left, starting at \(D_n\). In each step we minimize one component of the solution tuple by computing all \(\rhd _s\)-simplifications, picking one minimal simplification, and replacing that component by it. Performing a simplification at one component preserves the minimality of the components to its right, so we obtain a minimal solution after a single pass.
There are several heuristics which may further improve the algorithm. One straightforward (but expensive) strategy is to delete a single clause in the clause form and to check whether the formula is still a solution; this feature is built in but is not used in the tests. Another (better) one is to eliminate clauses in the CNF which do not contain variables from \(\bar{\alpha }\). The example below illustrates advantages and potential problems with this heuristic.
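The right-to-left loop described above can be sketched as follows. This is only an illustration under stated assumptions: `step` stands in for one \(\rhd _s\)-step on a single component, `is_solution` stands in for the decidable check of the solution conditions, and `size` is some complexity measure. All three are hypothetical parameters of the sketch, and the toy instances in the example are not resolution or paramodulation.

```python
def reachable(component, step, ok):
    """All simplifications reachable from `component` by repeated `step`s
    such that every intermediate result satisfies `ok`."""
    seen, todo = {component}, [component]
    while todo:
        current = todo.pop()
        for nxt in step(current):
            if nxt not in seen and ok(nxt):
                seen.add(nxt)
                todo.append(nxt)
    return seen

def minimize_tuple(solution, step, is_solution, size):
    """Minimize the components of a solution tuple from right to left."""
    psi = list(solution)
    for i in reversed(range(len(psi))):
        def ok(candidate, i=i):
            # The tuple with component i replaced must still be a solution.
            return is_solution(tuple(psi[:i] + [candidate] + psi[i + 1:]))
        candidates = reachable(psi[i], step, ok)
        # Minimal: no further step leads to a solution.
        minimal = [c for c in candidates if not any(ok(n) for n in step(c))]
        psi[i] = min(minimal, key=size)
    return tuple(psi)

# Toy stand-ins (assumptions): a step drops one element of a component,
# and a tuple counts as a solution if each component keeps a required element.
def drop_one(component):
    for x in component:
        yield component - {x}

required = ("a", "e")

def toy_is_solution(psi):
    return all(req in comp for req, comp in zip(required, psi))

start = (frozenset({"a", "b", "c"}), frozenset({"d", "e"}))
print(minimize_tuple(start, drop_one, toy_is_solution, len))
```

The search in `reachable` terminates only if `step` is terminating, which is what Proposition 47 states for \(\rhd _s\).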
Example 53
Let A be a solution in CNF and construct \(A'\) from A by removing all clauses that do not contain variables from \({\bar{\alpha }}\). Then we have to check whether \(A'\) is a solution.
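The construction of \(A'\) can be sketched as a simple filter. The term representation (atoms as nested tuples of symbol names, literals as pairs of an atom and a polarity) and the function names are assumptions made for this illustration. The check whether \(A'\) is still a solution is not part of the sketch.

```python
def symbols(term):
    """All symbol names occurring in a term, where a term is either a
    string or a nested tuple such as ("P", ("s", "alpha"))."""
    if isinstance(term, str):
        return {term}
    out = set()
    for part in term:
        out |= symbols(part)
    return out

def drop_alpha_free_clauses(cnf, alphas):
    """Keep only the clauses in which some variable from `alphas` occurs."""
    return frozenset(
        clause
        for clause in cnf
        if any(symbols(atom) & alphas for atom, _positive in clause)
    )

# Example: {P(c)} contains no alpha and is dropped;
# {not P(alpha), P(s(alpha))} is kept.
cnf = frozenset({
    frozenset({(("P", "c"), True)}),
    frozenset({(("P", "alpha"), False), (("P", ("s", "alpha")), True)}),
})
print(drop_alpha_free_clauses(cnf, {"alpha"}))
```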
The example below illustrates the procedure of computing a minimal solution from a canonical solution tuple.
Example 54
We now illustrate the use of forgetful inference in simplifying a solution for two cut formulas.
Example 55
Remark 56
Example 55 shows that the simplification must start from the “rear”, i.e. we must first simplify the solution for \(X_n\), then that for \(X_{n-1}\), and so on. The reason is that the simplified formula may be logically weaker, and the best place to insert a weaker cut is at the lowermost cut; here only the right-hand side of the lowermost cut (that is, the last solution condition) has to be checked. This order of simplification is also the one implemented and used for the tests.
5.3 Beautifying the Solution
The minimization procedure defined above takes a solution in conjunctive normal form and combines some of the clauses into new clauses. These new clauses form the actual non-analytic content of the lemma that we generate. However, there can be parts of the CNF that the minimization procedure did not modify; these unmodified parts are then just instances of formulas in the end-sequent. In addition, some clauses of the minimized solution may contain literals that already occur in the end-sequent and are hence always true.
Example 57
In contrast to solution minimization, we will not only modify the solution, but the decomposition as well. We will first define the operations on the solution, and then show their effect on the decomposition.
Definition 58

\(\mathcal C \cup \{C\} \rhd ^\mathcal {S}_{as} \mathcal C\) if C is subsumed by a clause in the CNF of \(\lnot \mathcal {S}\) (“axiom subsumption”)

\(\mathcal C \cup \{C \cup \{l\}\} \rhd ^\mathcal {S}_{ur} \mathcal C \cup \{C\}\) if \(\lnot l\) is subsumed by a clause in the CNF of \(\lnot \mathcal {S}\) (“unit resolution”)

\((\mathcal C_1, \dots , \mathcal C_i, \dots , \mathcal C_n) \rhd ^\mathcal {S}_b (\mathcal C_1, \dots , \mathcal C_i', \dots , \mathcal C_n)\) if \(\mathcal C_i \rhd ^\mathcal {S}_{as} \mathcal C_i'\) or \(\mathcal C_i \rhd ^\mathcal {S}_{ur} \mathcal C_i'\),

\((\mathcal C_1, \dots , \mathcal C_{i-1}, \{\}, \mathcal C_{i+1}, \dots , \mathcal C_n) \rhd ^\mathcal {S}_b (\mathcal C_{i+1}, \dots , \mathcal C_n)\), and

\((\mathcal C_1, \dots , \mathcal C_{i-1}, \{\{\}, \dots \}, \mathcal C_{i+1}, \dots , \mathcal C_n) \rhd ^\mathcal {S}_b (\mathcal C_1, \dots , \mathcal C_{i-1}, \mathcal C_{i+1}, \dots , \mathcal C_n)\).
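The two clause-level rules, axiom subsumption and unit resolution, can be illustrated on ground clauses by the following sketch. It assumes that the relevant clauses of the CNF of \(\lnot \mathcal {S}\) are already given as ground instances (`axioms`), so that subsumption reduces to the subset relation. This simplification, the representation and the function names are assumptions of the sketch. The tuple-level rules for \(\{\}\) and \(\{\{\}, \dots \}\) are not covered, and the two rules are interleaved in an arbitrary order.

```python
def negate(lit):
    atom, positive = lit
    return (atom, not positive)

def beautify_component(clauses, axioms):
    """Exhaustively apply ground axiom subsumption and unit resolution.

    clauses: iterable of frozensets of literals (atom, positive)
    axioms:  set of ground clauses standing in for the CNF of "not S"
    """
    clauses = set(clauses)
    changed = True
    while changed:
        changed = False
        for clause in list(clauses):
            # Axiom subsumption: drop a clause that contains an axiom clause.
            if any(ax <= clause for ax in axioms):
                clauses.discard(clause)
                changed = True
                continue
            # Unit resolution: drop a literal whose dual is a unit axiom.
            for lit in clause:
                if frozenset({negate(lit)}) in axioms:
                    clauses.discard(clause)
                    clauses.add(clause - {lit})
                    changed = True
                    break
    return frozenset(clauses)

axioms = {frozenset({("Pc", True)}), frozenset({("Q", False)})}
component = {
    frozenset({("Pc", True)}),               # removed by axiom subsumption
    frozenset({("R", True), ("Q", True)}),   # Q removed by unit resolution
    frozenset({("R", False), ("T", True)}),  # unchanged
}
print(beautify_component(component, axioms))
```

Each pass through the loop that changes anything removes either a clause or a literal, so the loop terminates.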
Example 59
Lemma 60
Let \(\mathcal {S}\) be a \(\varSigma _1\)-sequent. Let F be a solution in CNF for a decomposition D of a term set corresponding to a Herbrand sequent of \(\mathcal {S}\). If \(F \rhd ^\mathcal {S}_b F'\), then there exists a decomposition \(D'\) corresponding to a potentially different Herbrand sequent of \(\mathcal {S}\) such that \(F'\) is a solution for \(D'\).
Proof
Let \(F = (\mathcal C_1, \dots , \mathcal C_i, \dots , \mathcal C_n)\), \(F' = (\mathcal C_1, \dots , \mathcal C_i', \dots , \mathcal C_n)\), and \(D = U \circ S_1 \circ \cdots \circ S_n\). Depending on the operation, we will add new elements to U. That is, we construct a decomposition \(D' = U' \circ S_1 \circ \cdots \circ S_n\) such that all the solution conditions are satisfied for \(D'\) and \(F'\).
First, consider the case that \(\mathcal C_i = \mathcal C \cup \{ C \cup \{l\} \} \rhd ^\mathcal {S}_{ur} \mathcal C \cup \{ C \} = \mathcal C_i'\). Let u be a term that describes an instance I of a formula in \(\mathcal {S}\) such that I implies \(\lnot l\), and set \(U' = U \cup \{u\}\). Assume without loss of generality that this instance is in the antecedent. Now we have \(I \models \mathcal C_i \supset \mathcal C_i'\) and \(\models \mathcal C_i' \sigma \supset \mathcal C_i \sigma \) for any substitution \(\sigma \). This implies that the solution conditions are still satisfied.
Now consider the case that \(\mathcal C_i = \mathcal C_i' \cup \{C\} \rhd ^\mathcal {S}_{as} \mathcal C_i'\). Let u be a term that describes an instance I of a formula in \(\mathcal {S}\) such that I implies C, and set \(U' = U \cup (u \circ S_i)\). Again assume without loss of generality that the instance is in the antecedent. Here we have \(\models \mathcal C_i \supset \mathcal C_i'\), and \(I\sigma \models \mathcal C_i'\sigma \supset \mathcal C_i\sigma \) for every \(\sigma \in S_i\), hence the solution conditions are satisfied as well. \(\square \)
Example 61
After applying axiom subsumption on the clauses \(\{ P c \}\) and \(\{ \lnot P s^6 c \}\), we need to add the instances \(f_1\) and \(f_3\) to U. Since these are already present, the decomposition does not change.
Starting from the minimized solution \(C_0\), we obtain the beautified solution by computing a \(C_b\) such that \(C_0 \,(\rhd ^\mathcal {S}_b)^*\, C_b\), and \(C_b\) cannot be further beautified. We achieve this by exhaustively reducing the solution using \(\rhd ^\mathcal {S}_b\).
Lemma 62
Let \(\mathcal {S}\) be a \(\varSigma _1\)-sequent. We define the complexity of a solution \(S = (\mathcal C_1, \dots , \mathcal C_n)\) to be the number of literals, clauses, and formulas contained in the solution: \(\Vert S \Vert = \sum _{i=1}^n (1 + \sum _{C \in \mathcal C_i} (1 + |C|))\). Then \(\rhd ^\mathcal {S}_b\) strictly decreases the complexity of the solution, and is hence strongly normalizing.
Proof
In each reduction, we either remove a literal, a whole clause, or a formula. \(\square \)
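As a small worked illustration of this measure, the following sketch counts one unit per formula (component), one per clause and one per literal. The representation of a solution as a tuple of sets of clauses and the function name are assumptions of the sketch.

```python
def solution_complexity(solution):
    """Number of formulas, clauses and literals in a solution tuple."""
    return sum(
        1 + sum(1 + len(clause) for clause in component)
        for component in solution
    )

solution = (
    {frozenset({("R", True)}), frozenset({("R", False), ("T", True)})},
    {frozenset({("P", True)})},
)
# First component: 1 + (1 + 1) + (1 + 2) = 6; second: 1 + (1 + 1) = 3.
print(solution_complexity(solution))  # 9

# Removing one literal lowers the measure by one.
smaller = (
    {frozenset({("R", True)}), frozenset({("T", True)})},
    {frozenset({("P", True)})},
)
print(solution_complexity(smaller))  # 8
```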
As a concrete strategy, we first apply axiom subsumption, then unit resolution, and at the end use the rules for \(\{\}\) and \(\{\{\}\}\).
6 Implementation and Experiments
6.1 Lattices
We start with a manually formalized proof of \(\mathcal {S}\).^{1} As in the sketched solution to the textbook exercise, this proof first shows transitivity, then anti-symmetry, and finally concludes that there exists no cycle of length 4. We run our algorithm on the Herbrand structure of this proof after cut-elimination. The algorithm will recover the two lemmas from just the information contained in the Herbrand sequent. This case study thus demonstrates how lemmas can be reflected in the (term-)structure of a Herbrand sequent obtained from eliminating these lemmas.
6.2 Large-scale Experiments
To demonstrate the wide applicability of our method, we have evaluated Algorithm 2 on a large data set of automatically generated proofs. The TSTP library (Thousands of Solutions from Theorem Provers, see [33]) contains proofs from a variety of automated theorem provers. We selected the first-order proofs (FOF and CNF) as of November 2015, consisting of a total of 138005 proofs. Of these proofs in the TSTP, GAPT can import 68198 proofs (49.41%) as Herbrand structures. The other proofs could not be imported because they use custom proof formats, do not contain any detailed proof at all, contain cyclic inferences, or because they use other unsupported or unsound inference rules. The imported proofs were produced by superposition and connection-based provers. Of these Herbrand structures, 32714 are trivial: each term has a different root symbol, that is, each formula in the end-sequent is instantiated at most once. Our method cannot generate lemmas for these trivial proofs.
We evaluated our lemma generation method on the remaining 35480 proofs and several different methods to generate decompositions: the \(\varDelta \)-table algorithm for a single variable, and many variables with and without row merging, as well as the so-called MaxSAT algorithm of [12] for different parameters.
The next big step is the improvement and beautification of the solution. Figure 2 shows the change of symbolic complexity when going from the canonical solution to the improved solution and finally to the beautified solution. As the size of the canonical solution varies widely depending on the size of the decomposition, we have normalized the symbolic complexity of the improved and beautified solutions by the symbolic complexity of the canonical solution. We also only show data for proofs where we could actually compute a non-trivial beautified solution. Improvement by itself only manages to reduce the size of the canonical solution in some cases; many solutions are irreducible. However, improvement plants the seed for beautification to significantly reduce the size of the solution: after beautification, the typical solution is only a third of the size of the canonical solution. During beautification, the size of the decomposition increased on average by 10. This is a small increase compared to the size of the decomposition.
It is hard to measure the effect of the algorithm on proof size. For one, we cannot fairly compare the size of the input proofs in the TSTP to the proofs with cut, simply because they are proofs in different calculi. The proofs in the TSTP are typically resolution proofs, while we produce proofs in LK that are cut-free except for the cuts we introduce. When we compare the produced proofs with cut-free proofs in LK, then we actually observe an increase in proof size. We used the GAPT tableau prover to generate cut-free proofs of the Herbrand sequents (this is the same prover used to generate the cut-free subproofs in the proofs with cut). The proofs with cut are typically 1.5 times longer than the cut-free ones.
Table 1: Examples of automatically generated lemmas

| Problem | Prover | Generated lemma |
| --- | --- | --- |
| SET190-6 | E | \(complement(complement(x)) = x\) |
| SET047-5 | Metis | \(set\_equal(x_2, x_1) \supset set\_equal(x_1, x_2)\) |
| SET175+3 | SInE | \((x_1 \cap x_2 \subseteq x_3 \vee sk(x_1 \cap x_2, x_3) \in x_2) \wedge (sk(x_2, x_1 \cap x_2) \in x_1 \wedge sk(x_2, x_1 \cap x_2) \in x_2 \supset x_2 \subseteq x_1 \cap x_2)\) |
| PUZ007-1 | E | \(female(x) \supset from\_mars(x) \vee truthteller(x)\) |
| RNG119+1 | E | \(aElementOf0(x_2, x_1) \wedge aIdeal0(x_1) \supset aElement0(x_2)\) |
| GRP040-3 | SNARK | \(subgroup\_member(identity) \wedge subgroup\_member(x) \supset subgroup\_member(inverse(x))\) |
| SEU154+1 | Prover9 | \(\lnot in(x, empty\_set)\) |
Table 1 shows a few examples of lemmas that were automatically generated from proofs in the TSTP data set. Our method finds purely equational lemmas, as well as propositionally more complex lemmas.
7 Conclusion and Future Work
We have presented an algorithm for the generation of quantified lemmas and evaluated its implementation. The algorithm takes an analytic proof in the form of a Herbrand sequent as input and creates a sequent calculus proof with \(\varPi _1\)-cuts. It is complete in the sense that it permits a reversal of any cut-elimination sequence [18]. This algorithm shows that, not only does the structure of an analytic proof reflect lemmas of non-analytic proofs of the same theorem, but the latter can be reconstructed from the former.
The evaluation of the implementation in the GAPT system has demonstrated that it is sufficiently efficient to be applied to proofs generated by automated theorem provers. We have demonstrated it on a case study generating the essential conditions of the definition of a partial order from a proof formulated in the language of lower semi-lattices.
This algorithm opens up a number of perspectives for future research: it is of proof-theoretic as well as of practical interest to obtain a better understanding of the structural differences between cut-free proofs generated by theorem provers and cut-free proofs generated by cut-elimination, in particular: which strategies of theorem provers are likely to generate proofs which have a structure similar to those obtained by cut-elimination (and hence permit a significant compression by our method)? Can we modify a given cut-free proof in order to adjust the structure to a more regular one, e.g., by factoring out certain background theories?
We also consider it an interesting foundational endeavor to carry out further case studies along the lines of that in Sect. 6.1 motivated by the question mentioned in the introduction: which central mathematical notions can be justified based on grounds of proof complexity alone (as opposed to human legibility of proofs)?
The algorithm for lemma generation described in this paper has been extended to a method for inductive theorem proving in [13]. In [13], the generated non-analytic formula is the induction invariant. Since the primary goal is to find any inductive proof, concerns about the legibility of proofs as addressed in Sect. 5 become secondary.
Last but not least, we plan to extend the method presented here to cuts with quantifier alternations. There is a satisfactory understanding of the shape of decompositions of more complex cuts, see [1, 2, 3, 4]. The central theoretical problem for an extension in this direction is the question whether every such more complex decomposition has a canonical solution. In [22], this question has been answered negatively for first-order logic without equality, and a partial algorithm for the introduction of a \(\varPi _2\)-cut which is capable of exponential compression has been given. However, for first-order logic with equality, the question remains open.
Footnotes
1. As of GAPT 2.2 this proof is included in examples/poset/posetproof.scala, and examples/poset/deltatable.scala contains a script that performs cut-introduction on that proof.
2. Cactus plots have been popularized by the SAT community to visualize the performance of different solvers on a benchmark set, and have since also been adopted by other competitions.
Notes
Acknowledgements
Open access funding provided by TU Wien (TUW). This work is supported by the Vienna Science and Technology Fund (WWTF) project VRG12004.
References
1. Afshari, B., Hetzl, S., Leigh, G.E.: Herbrand disjunctions, cut elimination and context-free tree grammars. In: Altenkirch, T. (ed.) International Conference on Typed Lambda Calculi and Applications (TLCA) 2015, LIPIcs, vol. 38, pp. 1–16. Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik (2015)
2. Afshari, B., Hetzl, S., Leigh, G.E.: Herbrand confluence for first-order proofs with \(\Pi _2\)-cuts. In: Probst, D., Schuster, P. (eds.) Concepts of Proof in Mathematics, Philosophy, and Computer Science, pp. 5–40. De Gruyter, Berlin (2016)
3. Afshari, B., Hetzl, S., Leigh, G.E.: On the Herbrand content of LK. In: Kohlenbach, U., van Bakel, S., Berardi, S. (eds.) 6th International Workshop on Classical Logic and Computation (CL&C 2016), EPTCS, vol. 213, pp. 1–10 (2016)
4. Afshari, B., Hetzl, S., Leigh, G.E.: Herbrand’s Theorem as Higher-Order Recursion. Preprint OWP201801, Mathematisches Forschungsinstitut Oberwolfach (2018)
5. Baaz, M., Zach, R.: Algorithmic structuring of cut-free proofs. In: Computer Science Logic (CSL) 1992. Lecture Notes in Computer Science, vol. 702, pp. 29–42. Springer (1993)
6. Birkhoff, G.: Lattice Theory, American Mathematical Society Colloquium Publications, vol. XXV, 3rd edn. American Mathematical Society, Providence (1967)
7. Bundy, A.: The automation of proof by mathematical induction. In: Voronkov, A., Robinson, J.A. (eds.) Handbook of Automated Reasoning, pp. 845–911. Elsevier, Amsterdam (2001)
8. Bundy, A., Basin, D., Hutter, D., Ireland, A.: Rippling: Meta-Level Guidance for Mathematical Reasoning, Cambridge Tracts in Theoretical Computer Science. Cambridge University Press, Cambridge (2005)
9. Cavagnetto, S.: The lengths of proofs: Kreisel’s conjecture and Gödel’s speed-up theorem. J. Math. Sci. 158(5), 689–707 (2009)
10. Colton, S.: Automated theory formation in pure mathematics. Ph.D. thesis, University of Edinburgh (2001)
11. Colton, S.: Automated Theory Formation in Pure Mathematics. Springer, Berlin (2002)
12. Eberhard, S., Ebner, G., Hetzl, S.: Algorithmic compression of finite tree languages by rigid acyclic grammars. ACM Trans. Comput. Log. 18(4), 26:1–26:20 (2017)
13. Eberhard, S., Hetzl, S.: Inductive theorem proving based on tree grammars. Ann. Pure Appl. Log. 166(6), 665–700 (2015)
14. Ebner, G., Hetzl, S., Reis, G., Riener, M., Wolfsteiner, S., Zivota, S.: System description: GAPT 2.0. In: 8th International Joint Conference on Automated Reasoning, IJCAR (2016)
15. Finger, M., Gabbay, D.: Equal rights for the cut: computable non-analytic cuts in cut-based proofs. Log. J. IGPL 15(5–6), 553–575 (2007)
16. Gentzen, G.: Untersuchungen über das logische Schließen. Mathematische Zeitschrift 39, 176–210, 405–431 (1934–1935)
17. Hetzl, S., Leitsch, A., Reis, G., Tapolczai, J., Weller, D.: Introducing quantified cuts in logic with equality. In: Demri, S., Kapur, D., Weidenbach, C. (eds.) Automated Reasoning - 7th International Joint Conference, IJCAR. Lecture Notes in Computer Science, vol. 8562, pp. 240–254. Springer (2014)
18. Hetzl, S., Leitsch, A., Reis, G., Weller, D.: Algorithmic introduction of quantified cuts. Theor. Comput. Sci. 549, 1–16 (2014)
19. Hetzl, S., Leitsch, A., Weller, D.: Towards algorithmic cut-introduction. In: Logic for Programming, Artificial Intelligence and Reasoning (LPAR-18). Lecture Notes in Computer Science, vol. 7180, pp. 228–242. Springer (2012)
20. Ireland, A., Bundy, A.: Productive use of failure in inductive proof. J. Autom. Reason. 16(1–2), 79–111 (1996)
21. Johansson, M., Dixon, L., Bundy, A.: Conjecture synthesis for inductive theories. J. Autom. Reason. 47(3), 251–289 (2011)
22. Leitsch, A., Lettmann, M.P.: The problem of \(\Pi _2\)-cut-introduction. Theor. Comput. Sci. 706, 83–116 (2018)
23. Miller, D., Nigam, V.: Incorporating tables into proofs. In: 16th Conference on Computer Science and Logic (CSL07). Lecture Notes in Computer Science, vol. 4646, pp. 466–480. Springer (2007)
24. Orevkov, V.: Lower bounds for increasing complexity of derivations after cut elimination. Zapiski Nauchnykh Seminarov Leningradskogo Otdeleniya Matematicheskogo Instituta 88, 137–161 (1979)
25. Plotkin, G.D.: A note on inductive generalization. Mach. Intell. 5(1), 153–163 (1970)
26. Plotkin, G.D.: A further note on inductive generalization. Mach. Intell. 6, 101–124 (1971)
27. Pudlák, P.: The Lengths of Proofs. In: Buss, S. (ed.) Handbook of Proof Theory, pp. 547–637. Elsevier, Amsterdam (1998)
28. Reynolds, J.C.: Transformational systems and the algebraic structure of atomic formulas. Mach. Intell. 5(1), 135–151 (1970)
29. Shoenfield, J.R.: Mathematical Logic, 2nd edn. Addison Wesley, Boston (1973)
30. Sorge, V., Colton, S., McCasland, R., Meier, A.: Classification results in quasigroup and loop theory via a combination of automated reasoning tools. Comment. Math. Univ. Carol. 49(2), 319–339 (2008)
31. Sorge, V., Meier, A., McCasland, R., Colton, S.: Automatic construction and verification of isotopy invariants. J. Autom. Reason. 40(2–3), 221–243 (2008)
32. Statman, R.: Lower bounds on Herbrand’s theorem. Proc. Am. Math. Soc. 75, 104–107 (1979)
33. Sutcliffe, G.: The TPTP problem library and associated infrastructure: the FOF and CNF parts, v3.5.0. J. Autom. Reason. 43(4), 337–362 (2009)
34. Vyskočil, J., Stanovský, D., Urban, J.: Automated proof compression by invention of new definitions. In: Clark, E.M., Voronkov, A. (eds.) Logic for Programming, Artificial Intelligence and Reasoning (LPAR-16). Lecture Notes in Computer Science, vol. 6355, pp. 447–462. Springer (2010)
35. Woltzenlogel Paleo, B.: Atomic cut introduction by resolution: proof structuring and compression. In: Clark, E.M., Voronkov, A. (eds.) Logic for Programming, Artificial Intelligence and Reasoning (LPAR-16). Lecture Notes in Computer Science, vol. 6355, pp. 463–480. Springer (2010)
Copyright information
Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.