Layer Systems for Confluence—Formalized
Abstract
Toyama’s theorem states that the union of two confluent term rewrite systems with disjoint signatures is again confluent. This is a fundamental result in term rewriting, and several proofs appear in the literature. The underlying proof technique has been adapted to prove further results like persistence of confluence (if a manysorted term rewrite system is confluent, then the underlying unsorted system is confluent) or the preservation of confluence by currying.
In this paper we present a formalization of modularity and related results in Isabelle/HOL. The formalization is based on layer systems, which cover modularity, persistence, currying (and more) in a single framework. The persistence result has been integrated into the certifier CeTA and the confluence tool CSI, allowing us to check confluence proofs based on persistent decomposition, of which modularity is a special case.
1 Introduction

Confluence is persistent [1], i.e., a TRS is confluent if and only if it is confluent as a manysorted TRS. This gives rise to a decomposition technique, and fully subsumes modularity.

Confluence is preserved by currying [11]. Currying is useful, for example, as a preprocessing step for deciding ground confluence.

The notion of modularity has been generalized as well, by weakening the assumption that the signatures of the two TRSs are disjoint; for example, confluence is modular for layerpreserving composable TRSs [16], and for quasiground systems [12].
The list goes on. All of these proofs are based on decomposing terms into a maximal top and remaining aliens, but with different sets of admissible tops. In each case, confluence is established by induction on the number of nested tops in that decomposition (the rank of a term). Layer systems [7] were introduced as an abstraction from these proofs. A layer system \(\mathfrak {L}\) is simply the set of admissible tops; for modularity, those are homogeneous multihole contexts, i.e., multihole contexts whose function symbols all belong to the signature of only one of the two given TRSs. At the heart of layer systems lies an adaptation of the modularity proof in [17]. When establishing confluence by layer systems, as remaining proof obligations, one has to check that a layer system satisfies so called layer conditions, which is easier than doing a full adaptation of the modularity proof.
Isabelle/HOL [15] is an interactive proof assistant based on higherorder logic with a HindleyMilner type system, extended with type classes. It follows the LCF tradition [9] in having a trusted kernel, which ensures that theorems follow from the axioms by construction. Isabelle features a structured proof language [20]. Another useful feature are locales, which allow bundling of functions and assumptions that are shared by several definitions and theorems. (For example, locales are used to model groups in Isabelle/HOL). The locale mechanism in Isabelle is quite powerful; in particular, locales can be instantiated (so \(\mathbb {Z}\) with addition, 0 as unit, and negation is a group) and extended (for example, the group locale is an extension of a semigroup locale, with additional operations (unit and inverse) and assumptions). Our main reason for using Isabelle/HOL is the existing Isabelle Formalization of Rewriting, IsaFoR [18]. In addition to fundamental notions of term rewriting like terms, substitutions, contexts, multihole contexts, and so on, IsaFoR is also the foundation of CeTA (Certified Tool Assertions), which can certify termination and confluence proofs, among other things.
In this paper we describe a formalization of layer systems in Isabelle/HOL as part of IsaFoR. In fact, the prospect of formalization was one of the selling points of layer systems, with the idea of making large parts of the proof reusable. Note that whereas adapting existing proofs is convenient on paper, it becomes a burden when done in a formalization. The resulting duplication of code (that is, theorem statements and proofs) would decrease maintainability and is therefore best avoided. Our effort covers modularity of confluence, persistence of confluence, and preservation of confluence by currying for first order term rewrite systems. To the best of our knowledge, this is the first time that any of these results has been fully formalized in a proof assistant.
From a practical perspective, our interest in formalization is motivated by our work on an automated confluence prover, CSI [14]. As with all software, CSI potentially contains bugs. In order to increase the trust in CSI, proof output in a machine readable format is supported, which can be checked using CeTA [18]. As part of our formalization effort, we have extended CeTA with support for a decomposition technique based on persistence of confluence, allowing CSI and potentially other confluence tools to produce certifiable proofs using this technique. We have prepared a website with examples and information about the used software at http://clinformatik.uibk.ac.at/software/lisa/ictac2018/.
For most theorems and many definitions, we provide the corresponding identifiers in the formalization; in the PDF version of this paper, they link to the HTML version of the formalization itself. Furthermore, links to selected defined symbols can be found on our website.
The remainder of this paper is structured as follows. We recall notations and basic definitions in Sect. 2. Then we present the layer conditions, which are central to our formalization, in Sect. 3. The next two sections are about persistence. Section 4 uses persistence as an example to illustrate how layer systems can be applied to obtain a confluence result, while Sect. 5 focuses on the persistent decomposition. In Sect. 6, we present details of the currying application. Finally, we conclude in Sect. 7.
2 Preliminaries
We use standard notation from term rewriting [3]. Let \(\mathcal {F}\) be a signature and \(\mathcal {V}\) be a set of variables. Then \(\mathcal {T}(\mathcal {F},\mathcal {V})\) is the set of terms over that signature. We denote by \(\mathcal {P}os(t)\) the set of positions of t. The subterm of t at position p is \(t_p\), and \(t[s]_p\) is the result of replacing the subterm at position p in t by s. We also write \(\mathcal {P}os_X(r)\) for the set of positions p of t such that the root symbol of \(t_p\) is in X. If \(X = \{x\}\) is a singleton set, we may omit the outer curly braces and write \(\mathcal {P}os_x(t)\). The set of variables of t is \(\mathcal {V}ar(t)\). The set of multihole contexts over \(\mathcal {F}\) and \(\mathcal {V}\) is denoted by \(\mathcal {C}(\mathcal {F},\mathcal {V})\). (Multihole contexts are terms that may contain occurrences of an extra constant \(\square \), representing their holes.) If C is a multihole context with n holes, then \(C[t_1,\dots ,t_n]\) denotes the term obtained by replacing the ith hole in C by \(t_i\) for \(1 \leqslant i \leqslant n\). On multihole contexts, we have a partial order \(\sqsubseteq \) which is generated by \(\square \sqsubseteq C\) and closure under contexts (\(D \sqsubseteq D'\) implies \(C[D] \sqsubseteq C[D']\)). The corresponding partial supremum operation is denoted by \(\sqcup \); intuitively it merges two multihole contexts.
A substitution \(\sigma ,\tau ,\dots \) is a map from variables to terms. The result of applying the substitution \(\sigma \) to the term t is denoted by \(t\sigma \). A term rewrite system (TRS) \(\mathcal {R}\) is a set of rules \(\ell \rightarrow r\), where \(\ell \) and r are terms, \(\ell \) is not a variable, and \(\mathcal {V}ar(r) \subseteq \mathcal {V}ar(\ell )\). There is a rewrite step from s to t ( Open image in new window ) if \(s = s[\ell \sigma ]_p\) and \(t = s[r\sigma ]_p\) for a position \(p \in \mathcal {P}os(s)\) and substitution \(\sigma \).
Given a relation \(\rightarrow \), we write Open image in new window and Open image in new window for its inverse and its reflexive transitive closure, respectively. A relation \(\rightarrow \) is confluent if Open image in new window implies Open image in new window . It is confluent on X if for all \(s \in X\), Open image in new window implies Open image in new window .^{1}
3 Layer Conditions
In the layer system approach to confluence, one sets up a layer system for a TRS \(\mathcal {R}\) that satisfies the socalled layer conditions. These layer conditions constitute the interface between the reusable part of the formalization and the parts that are specific to a particular application of layer systems (e.g., modularity). Since they are central to the formalization, we recall the basic constructions and the layer conditions here. For full details please refer to [7].
Recall that modularity of confluence states that the union of two TRSs over disjoint signatures is confluent if each of the two TRSs is confluent (the converse is also true and fairly easy to prove). Modularity is proved by induction on the rank of a term; to obtain the rank, one decomposes the term into alternating layers of multihole contexts over the two signatures; the rank is the maximum nesting depth of the resulting layers.
Example 1
Let \(\mathcal {F}_1 = \{\mathsf {A},\mathsf {F}\}\) and \(\mathcal {F}_2 = \{\mathsf {b},\mathsf {g}\}\). Then \(\mathop {\mathrm {rank}}(\mathsf {F}(\mathsf {F}(\mathsf {A}))) = 1\), while \(\mathop {\mathrm {rank}}(\mathsf {g}(\mathsf {b}, \mathsf {F}(\mathsf {b}))) = 3\); the latter term is decomposed into \(\mathsf {g}(\mathsf {b}, \square )\), \(\mathsf {F}(\square )\) and \(\mathsf {b}\).
Layer systems abstract from this situation by considering all possible multihole contexts at the top of such a decomposition. So a layer system is a set of multihole contexts, and gives rise to tops and maximal tops as follows.
Definition 2
([7, Definition 3.1]). Let \(\mathcal {F}\) be a signature and \(\mathcal {V}\) be an infinite set of variables. Let \(\mathfrak {L}\subseteq \mathcal {C}(\mathcal {F},\mathcal {V})\) be a set of multihole contexts over \(\mathcal {F}\). Then \(L \in \mathfrak {L}\) is called a top of a context \(C \in \mathcal {C}(\mathcal {F},\mathcal {V})\) (according to \(\mathfrak {L}\)) if \(L \sqsubseteq C\). A top is a maxtop of C if it is maximal with respect to \(\sqsubseteq \) among the tops of C.
We want to prove that all terms are confluent, provided that terms of rank 1 are confluent. To this end we have to impose certain restrictions on the layer system.

the rank must be welldefined, which is ensured if any term has a unique maxtop that is not empty (i.e., not equal to \(\square \));

a rewrite step must span several layers (so it can be mimicked by a suitable rank 1 term); and

the rank must not increase by rewriting.
Example 3
 (a)
Here, we have the rewrite step \(\mathsf {f}(\mathsf {c}, \mathsf {c}) \rightarrow \mathsf {f}(\mathsf {c}, \mathsf {g}(\mathsf {c}))\), decomposed by some set of layers \(\mathfrak {L}\). However, the \(\mathsf {c}\) subterm becomes two layers after the rewrite step, increasing the rank. So rewriting a layer must again result in a layer.
 (b)
This is the same rewrite step as in (a). In this example, \(\mathsf {g}(\mathsf {c})\) may be a layer. However, the resulting term merges with the layer above (a phenomenon we call fusion). In the example, the fusion is partial; the fused context is broken apart. This is caused by there being a layer \(\mathsf {f}(\square , \mathsf {g}(\square ))\) but no layer \(\mathsf {f}(\square , \mathsf {g}(\mathsf {c}))\).
 (c)In this example, there is a root step \(\mathsf {h}(\mathsf {c}, \mathsf {c}) \rightarrow \mathsf {g}(\mathsf {h}(\mathsf {c}, \mathsf {c}))\). Note that both \(\mathsf {c}\) constants in the result originate in the isolated \(\mathsf {c}\), but nevertheless, one of them has fused with the top in the result (so the rewrite step takes place above the point where fusion happens, hence fusion from above). In [7, Example 3.4] we show that the TRShas a set of layers such that fusion from above is the sole reason for the system being nonconfluent despite being confluent on terms of rank 1.$$\begin{aligned} \mathsf {f}(x,x) \rightarrow \mathsf {a} \qquad \qquad \mathsf {f}(x,\mathsf {g}(x)) \rightarrow \mathsf {b} \qquad \qquad \mathsf {h}(\mathsf {c},x) \rightarrow \mathsf {g}(\mathsf {h}(x,x)) \end{aligned}$$
 (d)
Finally, it may happen that a rewrite step triggers fusion in a position that is parallel to the rewrite step. (aliens are what remains of a term after taking away its maxtop; here a rewrite step in one alien causes another alien to fuse, hence conspiring aliens). As far as we know, this is not actually an obstruction to confluence, but nevertheless absence of conspiring aliens is required for our proof.
Definition 4
([7, Definition 3.3]). Let \(\mathcal {F}\) be a signature. A set \(\mathfrak {L}\subseteq \mathcal {C}(\mathcal {F},\mathcal {V})\) of contexts is called a layer system^{2} if it satisfies properties (L\(_1\)), (L\(_2\)), and (L\(_3\)). The elements of \(\mathfrak {L}\) are called layers. A TRS \(\mathcal {R}\) over \(\mathcal {F}\) is weakly layered (according to a layer system \(\mathfrak {L}\)) if condition (W) is satisfied for each \(\ell \rightarrow r \in \mathcal {R}\). It is layered (according to a layer system \(\mathfrak {L}\)) if conditions (W), (C\(_1\)), and (C\(_2\)) are satisfied. The conditions are as follows.

(L\(_1\)) Each term in \(\mathcal {T}(\mathcal {F},\mathcal {V})\) has a nonempty top.

(L\(_2\)) If \(x \in \mathcal {V}\) and \(C \in \mathcal {C}(\mathcal {F},\mathcal {V})\) then \(C[x]_p \in \mathfrak {L}\) if and only if \(C[\square ]_p \in \mathfrak {L}\).

(L\(_3\)) If \(L, N \in \mathfrak {L}\), \(p \in \mathcal {P}os_\mathcal {F}(L)\), and \(L_p \sqcup N\) is defined then \(L[L_p \sqcup N]_p \in \mathfrak {L}\).

(W) If M is a maxtop of s, \(p \in \mathcal {P}os_\mathcal {F}(M)\), and \(s \rightarrow _{p,\ell \rightarrow r} t\) then \(M \rightarrow _{p,\ell \rightarrow r} L\) for some \({L \in \mathfrak {L}}\).

(C\(_1\)) In (W) either L is a maxtop of t or \(L = \square \).

(C\(_2\)) If \(L, N \in \mathfrak {L}\) and \(L \sqsubseteq N\) then \(L[N_p]_p \in \mathfrak {L}\) for any \(p \in \mathcal {P}os_\square (L)\).
In a nutshell, (L\(_1\)) and (L\(_3\)) ensure that the rank is welldefined. Property (L\(_2\)) is a technical property that ensures that aliens can always be represented by suitable variables in the confluence proof. Condition (W) prevents breaking layers, and together with (L\(_3\)), fusion from above. The final two conditions, (C\(_1\)) and (C\(_2\)), prevent fusion from above and conspiring aliens, respectively. Now, let us formally define the rank and aliens of a term.
Definition 5
([7, Definition 3.6]). Let \(t = M[{t_1},\dots ,{t_{n}}]\) with M the maxtop of t. We define \(\mathop {\mathrm {rank}}(t) = 1 + \max \{ \mathop {\mathrm {rank}}(t_i) \mid 1 \leqslant i \leqslant n \}\), where \(\max (\varnothing ) = 0\) (\({t_1},\dots ,{t_{n}}\) are the aliens of t).
The main theorems of [7] are as follows (we omit [7, Theorem 4.3] because it has yet to be formalized).
Theorem 6
([7, Theorem 4.1]). Let \(\mathcal {R}\) be a weakly layered TRS that is confluent on terms of rank one. If \(\mathcal {R}\) is leftlinear then \(\mathcal {R}\) is confluent.
Theorem 7
([7, Theorem 4.6]). Let \(\mathcal {R}\) be a layered TRS that is confluent on terms of rank one. Then \(\mathcal {R}\) is confluent.
Within the formalization, Theorem 6 is established inside the weakly_layered locale as theorem weakly_layered.CR_ll , whereas Theorem 7 is holds in the layered locale as theorem layered.CR . (In fact these statements are declared as locale assumptions; they become theorems by proving suitable sublocale relationships. This is done in LS_Left_Linear.thy and LS_General.thy). The proofs of these main results correspond to Sect. 4 of [7]. The (lengthy) proof works by induction on the rank: assuming that terms of rank r are confluent, several auxiliary results are derived, and finally, confluence of terms of rank \(r+1\) follows. To this end, we use two more locales weakly_layered_induct and weakly_layered_induct_dd that capture the induction hypothesis, and an auxiliary assumption (namely that local peaks of so called short steps are joinable in a suitable way), respectively. For this use of locales it is crucial that they can be interpreted inside of a proof, since the induction hypothesis cannot be established for arbitrary r outside of an induction proof. This happens in the proof of the main lemma [7, Lemma 4.27] which we give in Fig. 4. Note that it does induction on the rank (called rk in the proof), and that it uses an interpret command to instantiate the weakly_layered_induct_dd locale based on the induction hypothesis inside the proof.
One major benefit of using locales is separation of concerns; thanks to the abstraction of the layer conditions as locales, we could already work on the applications like modularity and currying before the proofs of the main results were complete, without having to worry about working with different assumptions. Basically, each application is an instantiation of these locales, which we could establish independently of the main results.
4 Persistence
To give an impression of what an application of layer systems entails, let us consider the case of persistence. This section overlaps with [7, Section 5.5], but here we focus on interesting aspects in the context of our formalization. In fact, given that the results presented here are both formalized and previously published, we focus on ideas rather than giving full proofs.
Definition 8
We wish to establish the following theorem using layer systems.
Theorem 9
(manysorted persistence, CR_persist ). Let \(\mathcal {R}\) be a manysorted TRS. We let \(\mathcal {V}= \bigcup _{\alpha \in \mathcal {S}} \mathcal {V}_\alpha \). Then \(\mathcal {R}\) is confluent on \(\mathcal {T}_\alpha \) for all \(\alpha \in \mathcal {S}\) if and only if \(\mathcal {R}\) is confluent on \(\mathcal {T}(\mathcal {F},\mathcal {V})\).
Lemma 10
( push_mt_subst , push_mt_ctxt ). The following properties hold for \(\mathsf {mt}_\alpha \).

if \(s \in \mathcal {T}_\alpha \) then \(\mathsf {mt}_\alpha (s \sigma )\) = \(s \sigma '\) where \(\sigma '(x) = \mathsf {mt}_\alpha (\sigma (x))\) for \(x \in \mathcal {V}_\alpha \); and

if \(p \in \mathcal {P}os(\mathsf {mt}_\alpha (t))\), then for some \(\beta \in \mathcal {S}\), all terms s satisfy \(\mathsf {mt}_\alpha (t[s]_p) = \mathsf {mt}_\alpha (t)[\mathsf {mt}_\beta (s)]\).
Remark 11
This idea of defining the maxtop as a function is a recurring theme; it features in the formalizations of modularity and currying as well. The main benefit of (recursive) functions is that they come with an induction principle that is not available for the implicit notion of a “maximal top”.
After showing that \(\mathfrak {L}\) layers \(\mathcal {R}\), Theorem 7 yields the following corollary.
Corollary 12
( CR_on_union ). If \(\mathcal {R}\) is confluent on \(\mathfrak {L}\cap \mathcal {T}(\mathcal {F},\mathcal {V})\),^{3} then \(\mathcal {R}\) is confluent on \(\mathcal {T}(\mathcal {F},\mathcal {V})\).
Let us now sketch a proof of Theorem 9. First note that if \(\mathcal {R}\) is a manysorted TRS, then the sets \(\mathcal {T}_\alpha \) are closed under rewriting by \(\mathcal {R}\); hence confluence of \(\mathcal {R}\) on \(\mathcal {T}(\mathcal {F},\mathcal {V})\) implies confluence of \(\mathcal {R}\) on \(\mathcal {T}_\alpha \) for any \(\alpha \in \mathcal {S}\). For the converse, we want to use Corollary 12. We need to show that \(\mathcal {R}\) is confluent on \(\mathfrak {L}\cap \mathcal {T}(\mathcal {F},\mathcal {V})\). To this end, assume that \(s \in \mathfrak {L}\cap \mathcal {T}(\mathcal {F},\mathcal {V})\), and we have a peak Open image in new window . If s is a variable then \(s = t = u\) and we’re done. Otherwise, we can read off the sort \(\alpha \) of s from its root symbol. Note that s is not necessarily an element of \(\mathcal {T}_\alpha \), because \(\mathfrak {L}\) disregards the sorts of variables. We modify s in two steps; first we annotate each variable with the type that is induced by its context (i.e., if x is the ith argument of \(f : \beta _1 \times \dots \times \beta _n \rightarrow \gamma \), then we replace it by \((x,\beta _i)\));^{4} and secondly we rename the annotated variables in such a way that each \((v,\beta )\) is replaced by an element of \(\mathcal {V}_\beta \). In this fashion, we obtain a peak Open image in new window , where \(s', t', u' \in \mathcal {T}_\alpha \), and a substitution \(\sigma \) with \(s = s'\sigma \), \(t = t'\sigma \) and \(u = u'\sigma \). By confluence of \(\mathcal {R}\) on \(\mathcal {T}_\alpha \), there is a valley Open image in new window , and hence a corresponding valley Open image in new window in \(\mathfrak {L}\cap \mathcal {T}(\mathcal {F},\mathcal {V})\).
5 Persistent Decomposition
Aoto and Toyama [1] pointed out that persistence gives rise to a decomposition technique for proving confluence. The basic idea is to attach sorts to a TRS. To obtain a decomposition, for each sort of the manysorted TRS obtained in that way, the set of rules that are applicable to terms of that sort is computed. By persistence, if all of the resulting systems are confluent, the original TRS is confluent as well. In [2] a refined version of the persistent decomposition is presented, wherein only the maximal systems w.r.t. the subset relation are considered.
Example 13
Definition 14
Remark 15
The notation \(\trianglerighteq \) is justified by the fact that \(\mathcal {T}_\alpha \ni s \trianglerighteq t \in \mathcal {T}_\beta \) implies \(\alpha \trianglerighteq \beta \). Note further that \(\alpha \trianglerighteq \beta \) implies \(\mathcal {R}_\alpha \supseteq \mathcal {R}_\beta \), so the maximal induced TRSs \(\mathcal {R}_\alpha \) w.r.t. subsets are induced by the maximal sorts \(\alpha \) w.r.t. \(\trianglerighteq \).
Since only rules from \(\mathcal {R}_\alpha \) are applicable to terms in \(\mathcal {T}_\alpha \), we have the following lemma.
Lemma 16
( CR_on_ \(\mathcal {T}_{\alpha }\) _by_needed_rules ). The system \(\mathcal {R}\) is confluent on \(\mathcal {T}_\alpha \) if and only if \(\mathcal {R}_\alpha \) is confluent on \(\mathcal {T}_\alpha \).
We formalize the persistent decomposition result as follows.
Theorem 17
( persistent_decomposition_nm ). Let \(\varSigma \subseteq \mathcal {S}\) be a set of sorts with the property that for each \(\beta \in \mathcal {S}\), either \(\mathcal {R}_\beta = \varnothing \), or \(\alpha \in \varSigma \) for some \(\alpha \trianglerighteq \beta \). Then \(\mathcal {R}\) is confluent on \(\mathcal {T}(\mathcal {F},\mathcal {V})\) if and only if \(\mathcal {R}_\alpha \) is confluent on \(\mathcal {T}(\mathcal {F},\mathcal {V})\) for all \(\alpha \in \varSigma \).
Since no proof has been given in the literature^{5} (as far as we know), we include one here.
Proof
First assume that \(\mathcal {R}_\alpha \) is confluent on \(\mathcal {T}(\mathcal {F},\mathcal {V})\) for all \(\alpha \in \varSigma \). By Theorem 9, confluence of \(\mathcal {R}\) on \(\mathcal {T}(\mathcal {F},\mathcal {V})\) follows if we can show that \(\mathcal {R}\) is confluent on \(\mathcal {T}_\beta \) for any \(\beta \in \mathcal {S}\). By Lemma 16, this is equivalent to \(\mathcal {R}_\beta \) being confluent on \(\mathcal {T}_\beta \). If \(\mathcal {R}_\beta = \varnothing \), we are done. Otherwise, by assumption, there is a sort \(\alpha \trianglerighteq \beta \) such that \(\mathcal {R}_\alpha \) is confluent on \(\mathcal {T}(\mathcal {F},\mathcal {V})\). Because \(\mathcal {T}_\beta \) is closed under rewriting by \(\mathcal {R}_\alpha \), \(\mathcal {R}_\alpha \) is confluent on \(\mathcal {T}_\beta \), which implies that \((\mathcal {R}_\alpha )_\beta = \mathcal {R}_\beta \) is confluent on \(\mathcal {T}_\beta \) by Lemma 16 and the fact that \(\mathcal {R}_\alpha \) is a manysorted TRS using the same signature as \(\mathcal {R}\).
For the other direction, assume that \(\mathcal {R}\) is confluent on \(\mathcal {T}(\mathcal {F},\mathcal {V})\). We show that \(\mathcal {R}_\alpha \) is confluent on \(\mathcal {T}(\mathcal {F},\mathcal {V})\) for all \(\alpha \in \mathcal {S}\) (and in particular those in \(\varSigma \)). Since \(\mathcal {R}_\alpha \) is a manysorted TRS, it is persistent (Theorem 9), so it suffices to show that \(\mathcal {R}_\alpha \) is confluent on \(\mathcal {T}_\beta \) for all \(\beta \in \mathcal {S}\). So consider a peak Open image in new window . We proceed by induction on \(s \in \mathcal {T}_\beta \).
If \(s \in \mathcal {V}\) then \(s = t = u\) and we are done. Otherwise, \(s = f(s_1,\dots ,s_n)\) for some \(f : \beta _1 \times \dots \times \beta _n \rightarrow \beta \), and \(s_1 \in \mathcal {T}_{\beta _1}\), ..., \(s_n \in \mathcal {T}_{\beta _n}\). There are two cases.
 1.
If \(\alpha \trianglerighteq \beta \), then since \(\mathcal {R}\) is confluent on \(\mathcal {T}_\beta \), \(\mathcal {R}_\beta \) is confluent on \(\mathcal {T}_\beta \). By Lemma 16 applied to \((\mathcal {R}_\alpha )_\beta = \mathcal {R}_\beta \), \(\mathcal {R}_\alpha \) is confluent on \(\mathcal {T}_\beta \) as well.
 2.
If \(\alpha \not \trianglerighteq \beta \), then \(\mathcal {R}_\alpha \) contains no rules whose root symbol has result sort \(\beta \). Consequently there cannot be any root steps in Open image in new window . Hence we obtain \(t_1,\dots ,t_n\) and \(u_1,\dots ,u_n\) with Open image in new window for \(1 \leqslant i \leqslant n\), \(t = f(t_1,\dots ,t_n)\), and \(u = f(u_1,\dots ,u_n)\). We conclude by the induction hypothesis (\(s_i\) is confluent for \(1 \leqslant i \leqslant n\)). \(\square \)
We further integrated this result into CeTA. To this end, we implemented a function that computes the maximal sorts (with respect to \(\trianglerighteq \)) for a given signature, a check function that checks the preconditions of Theorem 17, and extended CeTA ’s CPF parser with a certificate format for a persistent decomposition (CPF is an XML format. The fragment for persistent decomposition is given in Fig. 6, and may be of interest to tool authors who want to incorporate certifiable persistent decomposition into their confluence tools).
6 Currying
Currying is the most complicated application of layer systems that we have formalized so far. Currying is a transformation of term rewrite systems in which applications of nary functions are replaced by n applications of a single fresh binary function symbol to a constant, thereby applying arguments to the function one by one. More formally, we introduce a fresh function symbol \(\bullet \) to denote application, whereas every other function symbol becomes a constant. We adopt the convention of writing \(f_n\) to denote a function symbol of arity n. Moreover, we denote the arity of a function symbol f with respect to the signature \(\mathcal {F}\) by \(\mathsf {a}_\mathcal {F}(f)\). We identify \(f_{\mathsf {a}_\mathcal {F}(f)}\) with f.
Definition 18
Given a TRS \(\mathcal {R}\) over a signature \(\mathcal {F}\), its curried version \({\mathsf {Cu}}(\mathcal {R})\) consists of rules \(\{{\mathsf {Cu}}(l) \rightarrow {\mathsf {Cu}}(r) \mid \ell \rightarrow r \in \mathcal {R}\}\), where \({\mathsf {Cu}}(t) = t\) if t is a variable and \({\mathsf {Cu}}(f(t_1, \dots , t_n)) = f_0 \bullet {\mathsf {Cu}}(t_1) \bullet \cdots \bullet {\mathsf {Cu}}(t_n)\). Here \(\bullet \) is a fresh leftassociative function symbol.
Currying is useful for deciding properties such as confluence [5] or termination [10]. For analyzing confluence by currying, the following result is important.
Theorem 19
( main_result_complete ). Let \(\mathcal {R}\) be a TRS. If \(\mathcal {R}\) is confluent, then \(\mathsf {Cu}(\mathcal {R})\) is confluent.
This result was proved by Kahrs [11]. Rather than working directly with \(\mathsf {Cu}(\mathcal {R})\), Kahrs works with the partial parametrization of \(\mathcal {R}\), which is given by \(\mathsf {PP}(\mathcal {R}) = \mathcal {R}\cup \mathcal {U}_\mathcal {F}\), where \(\mathcal {U}_\mathcal {F}\) is the set of uncurrying rules for \(\mathcal {F}\) (see Definition 20). Confluence of \(\mathsf {PP}(\mathcal {R})\) and \(\mathsf {Cu}(\mathcal {R})\) are closely related, cf. Lemma 21.
Definition 20
Lemma 21
([11, Proposition 3.1]). Let \(\mathcal {R}\) be a TRS. Then \(\mathsf {Cu}(\mathcal {R})\) is confluent if \(\mathsf {PP}(\mathcal {R})\) is.
Whereas for Lemma 21 we could follow the lines of the paper proof, the formalization of the fact that \(\mathsf {PP}(\mathcal {R})\) is layered according to \(\mathfrak {L}\) turned out to be much more tedious. As with the modularity and persistence applications, we found it convenient to define functions that compute the maxtop of a term, since the abstract definition of maxtops in the layer framework is not really suitable for proofs in Isabelle.
Definition 22
After proving the correctness of \(\mathsf {mt_1}\) and \(\mathsf {mt_{Cu}}\), the main difficulty was the proof of condition (C\(_1\)) for \(\mathfrak {L}\) and \(\mathsf {PP}(\mathcal {R})\). Similar to Lemma 10, we proved facts about the interaction of \(\mathsf {mt_1}\) (and hence \(\mathsf {mt_{Cu}}\)) with contexts and substitutions, in order to analyze a rewrite step \(s = C[l\sigma ]_p \rightarrow C[r\sigma ]_p\) with p a function position of the maxtop M of s.
Lemma 23
( push_mt_in_ctxt ). Let s be a term and p the hole position of context C such that \(C[s]_p \in \mathcal {T}(\mathcal {F}^{\bullet },\mathcal {V})\) and \(p \in \mathcal {P}os_{\mathcal {F}^{\bullet }}(\mathsf {mt_1}(C[s],j))\). Then there exists a context D and a natural number k such that \(\mathsf {mt_1}(C[s],j) = D[\mathsf {mt_1}(s,k)]\), and \(\mathsf {mt_1}(C[t],j) = D[\mathsf {mt_1}(t,k)]\) for any term \(t \in \mathcal {T}(\mathcal {F}^{\bullet },\mathcal {V})\) having the same number of missing arguments as s.
Lemma 24
( push_mt_in_subst ). Let \(t \in \mathcal {T}(\mathcal {F},\mathcal {V})\). Then \(\mathsf {mt_1}(t \cdot \sigma ,0) = \mathsf {mt_1}(t,0) \cdot \sigma '\) with \(\sigma ' = (\lambda x.\,\mathsf {mt_1}(x,0)) \circ \sigma \).
Remark 25
7 Conclusion
We have presented a formalization of modularity, persistence, and currying, in the Isabelle proof assistant. The formalization spans about 12k lines of theory files and took approximately 9 personmonths to develop. A breakdown of the effort is given in Fig. 7. (Note that modularity is subsumed by persistence. We formalized modularity first because it is the easiest application. Many proof ideas for modularity carried over to the other, more difficult applications.) The de Bruijn factor (which compares the size of the formalized proof to the paper version) varies wildly. We believe that the main reason for this is that the level of detail for proofs in [7] varies greatly; the core confluence proof (leading up to Theorem 7) is carried out in much more detail than the applications, where large parts of the proofs rely on the reader’s intuition. A second contributing factor is that two people worked on different parts of the formalization.
We integrated the persistence result into our theorem prover CSI (which already supported ordersorted persistence, so the main effort for extending CSI was adding the XML output.) We present experimental results in Fig. 8. The check mark Open image in new window indicates certified strategies; CSI Open image in new window and +pd Open image in new window are the certified strategies with and without persistent decomposition, respectively, while CSI refers to the uncertified, full strategy of CSI. As can be seen from the data, we have achieved a modest improvement in certified proofs over the Cops database of confluence problems.^{6} It is worth noting that there is no progress in certified nonconfluence proofs; in fact, there is no certification gap for nonconfluence at all. For nonconfluence, CSI employs tree automata [8], which (in theory, and evidently also in practice) subsume the manysorted decomposition result, because manysorted terms are a regular tree language.
There are several parts of [7] that have not yet been formalized. For one, there are two more applications of layer systems, namely modularity of layerpreserving composable TRSs, and a modularity result for quasiground systems. The bigger missing part are variablerestricted layer systems, which are the foundation for a generalized persistence result with ordered sorts [7, Theorem 6.3]. Furthermore, while we have formalized preservation of confluence by currying, this is not integrated into CeTA. As far as we know, no confluence tool currently uses currying directly. However, currying is the basis of efficient decision procedures for ground TRSs, which are implemented in CSI, and are a target for future formalization efforts.
Footnotes
 1.
Another reasonable definition for “\(\rightarrow \) is confluent on X” would be that \({\rightarrow } \cap (X\times X)\) is confluent; this is equivalent to the given definition whenever X is closed under rewriting by \(\rightarrow \).
 2.
 3.
Because multihole contexts are not terms, this is \(\{t.~ mctxt\_of\_term~ t ~ \in \mathfrak {L}\}\) in the formalization.
 4.
This annotation procedure formalizes the following sentence in the proof of [7, Theorem 5.13]: “Note that for each p the sort of \(s'_p\) is uniquely determined by s.”.
 5.
The proof is not difficult, but as a system description, [2] lacked space for a proof.
 6.
Full results are available at http://clinformatik.uibk.ac.at/software/lisa/ictac2018/.
References
 1.Aoto, T., Toyama, Y.: Extending persistency of confluence with ordered sorts. Technical report ISRR960025F, School of Information Science, JAIST (1996)Google Scholar
 2.Aoto, T., Yoshida, J., Toyama, Y.: Proving confluence of term rewriting systems automatically. In: Treinen, R. (ed.) RTA 2009. LNCS, vol. 5595, pp. 93–102. Springer, Heidelberg (2009). https://doi.org/10.1007/9783642023484_7CrossRefzbMATHGoogle Scholar
 3.Baader, F., Nipkow, T.: Term Rewriting and All That. Cambridge University Press, Cambridge (1998). https://doi.org/10.1017/cbo9781139172752CrossRefzbMATHGoogle Scholar
 4.Ballarin, C.: Locales: a module system for mathematical theories. J. Autom. Reason. 52(2), 123–153 (2014). https://doi.org/10.1007/s1081701392847MathSciNetCrossRefzbMATHGoogle Scholar
 5.Felgenhauer, B.: Deciding confluence of ground term rewrite systems in cubic time. In: Tiwari, A. (ed.) Proceedings of 23rd International Conference on Rewriting Techniques and Applications. RTA 2012, May–June 2012, Nagoya. Leibniz International Proceedings in Informatics, vol. 15, pp. 165–175. Dagstuhl Publishing, Saarbrücken, Wadern (2012). https://doi.org/10.4230/lipics.rta.2012.165
 6.Felgenhauer, B.: Decreasing diagrams II. AFP, formal proof development (2015). https://www.isaafp.org/entries/DecreasingDiagramsII.html
 7.Felgenhauer, B., Middeldorp, A., Zankl, H., van Oostrom, V.: Layer systems for proving confluence. ACM Trans. Comput. Log. 16(2), 14 (2015). https://doi.org/10.1145/2710017MathSciNetCrossRefzbMATHGoogle Scholar
 8.Felgenhauer, B., Thiemann, R.: Reachability, confluence, and termination analysis with statecompatible automata. Inf. Comput. 253(3), 467–483 (2017). https://doi.org/10.1016/j.ic.2016.06.011MathSciNetCrossRefzbMATHGoogle Scholar
 9.Gordon, M.J., Milner, A.J., Wadsworth, C.P.: Edinburgh LCF. LNCS, vol. 78. Springer, Heidelberg (1979). https://doi.org/10.1007/3540097244CrossRefzbMATHGoogle Scholar
 10.Hirokawa, N., Middeldorp, A., Zankl, H.: Uncurrying for termination. In: Cervesato, I., Veith, H., Voronkov, A. (eds.) LPAR 2008. LNCS (LNAI), vol. 5330, pp. 667–681. Springer, Heidelberg (2008). https://doi.org/10.1007/9783540894391_46CrossRefzbMATHGoogle Scholar
 11.Kahrs, S.: Confluence of curried termrewriting systems. J. Symb. Comput. 19(6), 601–623 (1995). https://doi.org/10.1006/jsco.1995.1035MathSciNetCrossRefzbMATHGoogle Scholar
 12.Kitahara, A., Sakai, M., Toyama, Y.: On the modularity of confluent term rewriting systems with shared constructors. Tech. Rep. Inf. Process. Soc. Jpn. 95(15), 11–20 (1995). (in Japanese)Google Scholar
 13.Klop, J., Middeldorp, A., Toyama, Y., de Vrijer, R.: Modularity of confluence: a simplified proof. Inf. Process. Lett. 49, 101–109 (1994). https://doi.org/10.1016/00200190(94)900345MathSciNetCrossRefzbMATHGoogle Scholar
 14.Nagele, J., Felgenhauer, B., Middeldorp, A.: CSI: new evidence – a progress report. In: de Moura, L. (ed.) CADE 2017. LNCS (LNAI), vol. 10395, pp. 385–397. Springer, Cham (2017). https://doi.org/10.1007/9783319630465_24CrossRefGoogle Scholar
 15.Nipkow, T., Wenzel, M., Paulson, L.C. (eds.): Isabelle/HOL. LNCS, vol. 2283. Springer, Heidelberg (2002). https://doi.org/10.1007/3540459499CrossRefzbMATHGoogle Scholar
 16.Ohlebusch, E.: Modular properties of composable term rewriting systems. Ph.D. thesis, Universität Bielefeld (1994)Google Scholar
 17.Oostrom, V.: Modularity of confluence. In: Armando, A., Baumgartner, P., Dowek, G. (eds.) IJCAR 2008. LNCS (LNAI), vol. 5195, pp. 348–363. Springer, Heidelberg (2008). https://doi.org/10.1007/9783540710707_31CrossRefGoogle Scholar
 18.Thiemann, R., Sternagel, C.: Certification of termination proofs using CeTA. In: Berghofer, S., Nipkow, T., Urban, C., Wenzel, M. (eds.) TPHOLs 2009. LNCS, vol. 5674, pp. 452–468. Springer, Heidelberg (2009). https://doi.org/10.1007/9783642033599_31CrossRefGoogle Scholar
 19.Toyama, Y.: On the ChurchRosser property for the direct sum of term rewriting systems. J. ACM 34(1), 128–143 (1987). https://doi.org/10.1145/7531.7534MathSciNetCrossRefzbMATHGoogle Scholar
 20.Wenzel, M.: Isar—a generic interpretative approach to readable formal proof documents. In: Bertot, Y., Dowek, G., Théry, L., Hirschowitz, A., Paulin, C. (eds.) TPHOLs 1999. LNCS, vol. 1690, pp. 167–183. Springer, Heidelberg (1999). https://doi.org/10.1007/3540482563_12CrossRefGoogle Scholar
Copyright information
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.