
1 Introduction

Termination analysis is one of the main tasks in program verification, and techniques and tools to analyze termination of term rewrite systems (TRSs) automatically have been studied for decades. While a direct application of classical reduction orderings is often too weak, these orderings can be used successfully within the dependency pair (DP) framework [3, 20]. This framework allows for modular termination proofs by decomposing the original termination problem into sub-problems whose termination can then be analyzed independently using different techniques. Thus, DPs are used in essentially all current termination tools for TRSs (e.g., AProVE [21], MuTerm [25], NaTT [46], TTT2 [33]). To allow certification of termination proofs with DPs, they have been formalized in several proof assistants and there exist several corresponding certification tools for termination proofs with DPs (e.g., CeTA [43]).

On the other hand, probabilistic programs are used to describe randomized algorithms and probability distributions, with applications in many areas, see, e.g., [23]. To use TRSs also for such programs, probabilistic term rewrite systems (PTRSs) were introduced in [4, 9, 10]. In the probabilistic setting, there are several notions of “termination”. In this paper, we mostly focus on analyzing almost-sure termination (AST), i.e., we want to prove automatically that the probability for termination is 1.

While there exist many automatic approaches to prove (P)AST of imperative programs on numbers (e.g., [2, 5, 11, 16, 22, 26,27,28, 36,37,38, 40]), there are only a few automatic approaches for programs with complex non-tail recursive structure [8, 12, 13]. The approaches that are also suitable for algorithms on recursive data structures [7, 35, 45] are mostly specialized for specific data structures and cannot easily be adjusted to other (possibly user-defined) ones, or are not yet fully automated.

For innermost AST (i.e., AST restricted to rewrite sequences where one only evaluates at innermost positions), we recently presented an adaption of the DP framework which allows us to benefit from a similar modularity as in the non-probabilistic setting [29, 32]. Unfortunately, there is no such powerful modular approach available for full AST (i.e., AST when considering arbitrary rewrite sequences). Up to now, full AST of PTRSs can only be proved via a direct application of orderings [4, 29], but there is no corresponding adaption of dependency pairs. (As explained in [29], a DP framework to analyze full instead of innermost AST would be “considerably more involved”.) Indeed, also in the non-probabilistic setting, innermost termination is usually substantially easier to prove than full termination, see, e.g., [3, 20]. In the non-probabilistic setting, there exist several sufficient criteria which ensure that innermost termination implies full termination, so that innermost termination proofs can be lifted to full rewriting [24].

Up to now, no such results were known in the probabilistic setting. Our paper presents the first sufficient criteria for PTRSs which ensure that AST coincides for full and innermost rewriting, and we also show similar results for other rewrite strategies like leftmost-innermost rewriting. We focus on criteria that can be checked automatically, so we can combine our results with the DP framework for proving innermost AST of PTRSs [29, 32]. In this way, we obtain a powerful modular technique that can also prove AST for full rewriting automatically.

We will also consider the stronger notion of positive almost-sure termination (PAST) [10, 42], which requires that the expected runtime is finite, and show that our criteria for the relationship between full and innermost probabilistic rewriting hold for PAST as well. In contrast to AST, PAST is not modular, i.e., composing two PAST programs in sequence may yield a program that is not PAST (see, e.g., [27]). Therefore, up to now there is no variant of DPs that allows one to prove PAST of PTRSs, but there only exist techniques to apply polynomial or matrix orderings directly [4].

We start with preliminaries on term rewriting in Sect. 2. Then we recapitulate PTRSs based on [4, 10, 14, 15, 29] in Sect. 3. In Sect. 4 we show that the properties of [24] that ensure equivalence of innermost and full termination do not suffice in the probabilistic setting and extend them accordingly. In particular, we show that innermost and full AST coincide for PTRSs that are non-overlapping and linear. This result also holds for PAST, as well as for strategies like leftmost-innermost evaluation. In Sect. 5 we show how to weaken the linearity requirement in order to prove full AST for larger classes of PTRSs. The implementation of our criteria in the tool AProVE is evaluated in Sect. 6. We refer to [30] for all proofs.

2 Preliminaries

We assume familiarity with term rewriting [6] and regard (possibly infinite) TRSs over a (possibly infinite) signature \(\varSigma \) and a set of variables \(\mathcal {V}\). As an example, consider the TRS \(\mathcal {R}_{\textsf{d}}\) with the rewrite rules \(\textsf{d}(\textsf{s}(x)) \rightarrow \textsf{s}(\textsf{s}(\textsf{d}(x)))\) and \(\textsf{d}(\mathcal {O}) \rightarrow \mathcal {O}\), which doubles a natural number (represented by a term built from \(\textsf{s}\) and \(\mathcal {O}\)). A TRS \(\mathcal {R}\) induces a rewrite relation \({\rightarrow _{\mathcal {R}}} \subseteq \mathcal {T}\left( \varSigma ,\mathcal {V}\right) \times \mathcal {T}\left( \varSigma ,\mathcal {V}\right) \) on terms where \(s \rightarrow _{\mathcal {R}} t\) holds if there is a position \(\pi \), a rule \(\ell \rightarrow r \in \mathcal {R}\), and a substitution \(\sigma \) such that \(s|_{\pi }=\ell \sigma \) and \(t = s[r\sigma ]_{\pi }\). A rewrite step \(s \rightarrow _{\mathcal {R}} t\) is an innermost rewrite step if all proper subterms of the used redex \(\ell \sigma \) are in normal form w.r.t. \(\mathcal {R}\) (i.e., they do not contain redexes themselves and thus, they cannot be reduced with \(\rightarrow _\mathcal {R}\)). For example, \(\textsf{d}(\textsf{s}(\mathcal {O})) \rightarrow _{\mathcal {R}_{\textsf{d}}} \textsf{s}(\textsf{s}(\textsf{d}(\mathcal {O})))\) is an innermost rewrite step.
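To make these notions concrete, the following is a minimal sketch (ours, not from the paper) of terms, matching, and innermost rewriting in Python, illustrated on the doubling TRS \(\mathcal {R}_{\textsf{d}}\); the tuple-based term representation and all function names are illustrative assumptions.

```python
# Terms are nested tuples ("f", arg1, ..., argn); variables are plain strings;
# a TRS is a list of (lhs, rhs) pairs.

def is_var(t):
    return isinstance(t, str)

def match(pattern, term, subst=None):
    """Return a substitution sigma with pattern*sigma == term, or None."""
    subst = dict(subst or {})
    if is_var(pattern):
        if pattern in subst:
            return subst if subst[pattern] == term else None
        subst[pattern] = term
        return subst
    if is_var(term) or pattern[0] != term[0] or len(pattern) != len(term):
        return None
    for p, t in zip(pattern[1:], term[1:]):
        subst = match(p, t, subst)
        if subst is None:
            return None
    return subst

def apply_subst(term, subst):
    if is_var(term):
        return subst[term]
    return (term[0],) + tuple(apply_subst(a, subst) for a in term[1:])

def positions(term):
    """All positions of a term, as tuples of argument indices (root = ())."""
    yield ()
    if not is_var(term):
        for i, arg in enumerate(term[1:]):
            for pos in positions(arg):
                yield (i,) + pos

def subterm(term, pos):
    for i in pos:
        term = term[1 + i]
    return term

def replace(term, pos, new):
    if not pos:
        return new
    args = list(term[1:])
    args[pos[0]] = replace(args[pos[0]], pos[1:], new)
    return (term[0],) + tuple(args)

def is_redex(trs, term):
    return not is_var(term) and any(match(lhs, term) is not None for lhs, _ in trs)

def innermost_step(trs, term):
    """Perform one innermost rewrite step; return None if term is a normal form."""
    for pos in positions(term):
        sub = subterm(term, pos)
        # innermost: sub is a redex, but none of its proper subterms is one
        if is_redex(trs, sub) and not any(
                is_redex(trs, subterm(sub, q)) for q in positions(sub) if q != ()):
            for lhs, rhs in trs:
                sigma = match(lhs, sub)
                if sigma is not None:
                    return replace(term, pos, apply_subst(rhs, sigma))
    return None

# R_d: d(s(x)) -> s(s(d(x))) and d(O) -> O
R_d = [(("d", ("s", "x")), ("s", ("s", ("d", "x")))),
       (("d", ("O",)), ("O",))]

t = ("d", ("s", ("O",)))          # d(s(O)), i.e. "double one"
while t is not None:
    print(t)
    t = innermost_step(R_d, t)    # ends in s(s(O))
```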

Let < be the prefix ordering on positions and let \(\le \) be its reflexive closure. Then for two parallel positions \(\tau \) and \(\pi \) we define \(\tau \prec \pi \) if we have \(i < j\) for the unique \(i, j\) such that \(\chi .i \le \tau \) and \(\chi .j \le \pi \), where \(\chi \) is the longest common prefix of \(\tau \) and \(\pi \). An innermost rewrite step at position \(\pi \) is leftmost if there exists no redex at a position \(\tau \) with \(\tau \prec \pi \).
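A small sketch of this ordering on parallel positions (positions as tuples of argument indices, 0-based, which is our own convention for these illustrations):

```python
def precedes(tau, pi):
    """tau ≺ pi: at the first argument index where the parallel positions
    differ, tau goes to a child further to the left."""
    for i, j in zip(tau, pi):
        if i != j:
            return i < j
    raise ValueError("positions are not parallel (one is a prefix of the other)")

print(precedes((0, 1), (0, 2)))   # True:  position 0.1 is left of 0.2
print(precedes((1,), (0, 3)))     # False: position 1 is right of 0.3
```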

We call a TRS \(\mathcal {R}\) strongly (innermost/leftmost innermost) normalizing (SN / iSN / liSN) if \(\rightarrow _{\mathcal {R}}\) (the innermost / leftmost innermost rewrite relation) is well founded. SN is also called “terminating” and iSN/liSN are called “innermost/leftmost innermost terminating”. If every term \(t \in \mathcal {T}\left( \varSigma ,\mathcal {V}\right) \) has a normal form (i.e., we have \(t \rightarrow _{\mathcal {R}}^* t'\) where \(t'\) is in normal form) then we call \(\mathcal {R}\) weakly normalizing (WN). Two terms s, t are joinable via \(\mathcal {R}\) (denoted \(s \downarrow _{\mathcal {R}} t\)) if there exists a term w such that \(s \rightarrow _{\mathcal {R}}^* w \leftarrow _{\mathcal {R}}^* t\). Two rules \(\ell _1 \rightarrow r_1, \ell _2 \rightarrow r_2 \in \mathcal {R}\) with renamed variables such that \(\mathcal {V}(\ell _1) \cap \mathcal {V}(\ell _2) = \varnothing \) are overlapping if there exists a non-variable position \(\pi \) of \(\ell _1\) such that \(\ell _1|_{\pi }\) and \(\ell _2\) are unifiable with a mgu \(\sigma \). If \((\ell _1 \rightarrow r_1) = (\ell _2 \rightarrow r_2)\), then we require that \(\pi \ne \varepsilon \). \(\mathcal {R}\) is non-overlapping (NO) if it has no overlapping rules. As an example, the TRS \(\mathcal {R}_{\textsf{d}}\) is non-overlapping. A TRS is left-linear (LL) (right-linear, RL) if every variable occurs at most once in the left-hand side (right-hand side) of a rule. A TRS is linear if it is both left- and right-linear. A TRS is non-erasing (NE) if in every rule, all variables of the left-hand side also occur in the right-hand side.

Next, we recapitulate the relations between iSN, SN, liSN, and WN in the non-probabilistic setting. We start with the relation between iSN and SN.

Counterexample 1

(Toyama’s Counterexample [44]). The TRS \(\mathcal {R}_1\) with the rules \(\textsf{f}(\textsf{a},\textsf{b},x) \rightarrow \textsf{f}(x,x,x)\), \(\textsf{g}\rightarrow \textsf{a}\), and \(\textsf{g}\rightarrow \textsf{b}\) is not SN since we have \(\textsf{f}(\textsf{a}, \textsf{b}, \textsf{g}) \rightarrow _{\mathcal {R}_1} \textsf{f}(\textsf{g}, \textsf{g}, \textsf{g}) \rightarrow _{\mathcal {R}_1} \textsf{f}(\textsf{a}, \textsf{g}, \textsf{g}) \rightarrow _{\mathcal {R}_1} \textsf{f}(\textsf{a}, \textsf{b}, \textsf{g}) \rightarrow _{\mathcal {R}_1} \ldots \) But the only innermost rewrite sequences starting with \(\textsf{f}(\textsf{a}, \textsf{b}, \textsf{g})\) are \(\textsf{f}(\textsf{a}, \textsf{b}, \textsf{g}) \rightarrow _{\mathcal {R}_1} \textsf{f}(\textsf{a}, \textsf{b}, \textsf{a}) \rightarrow _{\mathcal {R}_1} \textsf{f}(\textsf{a}, \textsf{a}, \textsf{a})\) and \(\textsf{f}(\textsf{a}, \textsf{b}, \textsf{g}) \rightarrow _{\mathcal {R}_1} \textsf{f}(\textsf{a}, \textsf{b}, \textsf{b}) \rightarrow _{\mathcal {R}_1} \textsf{f}(\textsf{b}, \textsf{b}, \textsf{b})\), i.e., both reach normal forms in the end. Thus, \(\mathcal {R}_1\) is iSN as we have to rewrite the inner \(\textsf{g}\) before we can use the \(\textsf{f}\)-rule.

The first property known to ensure equivalence of SN and iSN is orthogonality. A TRS is orthogonal if it is non-overlapping and left-linear.

Theorem 2

(From iSN to SN (1), [41]). If a TRS \(\mathcal {R}\) is orthogonal, then \(\mathcal {R}\) is SN iff \(\mathcal {R}\) is iSN.

Then, in [24] it was shown that one can remove the left-linearity requirement.

Theorem 3

(From iSN to SN (2), [24]). If a TRS \(\mathcal {R}\) is non-overlapping, then \(\mathcal {R}\) is SN iff \(\mathcal {R}\) is iSN.

Finally, [24] also refined Thm. 3 further. A TRS \(\mathcal {R}\) is an overlay system (OS) if its rules may only overlap at the root position, i.e., \(\pi = \varepsilon \). For Ex. 1 one can see that the only overlaps (between the two \(\textsf{g}\)-rules) occur at the root position, i.e., \(\mathcal {R}_1\) is an overlay system. Furthermore, a TRS is locally confluent (or weakly Church-Rosser, abbreviated WCR) if for all terms \(s,t_1,t_2\) with \(t_1 \leftarrow _{\mathcal {R}} s \rightarrow _{\mathcal {R}} t_2\), the terms \(t_1\) and \(t_2\) are joinable. So \(\mathcal {R}_1\) is not WCR, as we have \(\textsf{a}\leftarrow _{\mathcal {R}_1} \textsf{g}\rightarrow _{\mathcal {R}_1} \textsf{b}\), but \(\textsf{a}\) and \(\textsf{b}\) are not joinable. If a TRS has both of these properties, then iSN and SN are again equivalent.

Theorem 4

(From iSN to SN (3), [24]). If a TRS \(\mathcal {R}\) is a locally confluent overlay system, then \(\mathcal {R}\) is SN iff \(\mathcal {R}\) is iSN.

Thm. 4 is stronger than Thm. 3 as every non-overlapping TRS is a locally confluent overlay system. We recapitulate the relation between WN and SN next.

Counterexample 5

Consider the TRS \(\mathcal {R}_2\) with the rules \(\textsf{f}(x) \rightarrow \textsf{b}\) and \(\textsf{a}\rightarrow \textsf{f}(\textsf{a})\). This TRS is not SN since we can always rewrite the inner \(\textsf{a}\) to get \(\textsf{a}\rightarrow _{\mathcal {R}_2} \textsf{f}(\textsf{a}) \rightarrow _{\mathcal {R}_2} \textsf{f}(\textsf{f}(\textsf{a})) \rightarrow _{\mathcal {R}_2} \ldots \), but it is WN since we can also rewrite the outer \(\textsf{f}(\ldots )\) before we use the \(\textsf{a}\)-rule twice, resulting in the term \(\textsf{b}\), which is a normal form. For the TRS \(\mathcal {R}_3\) with the rules \(\textsf{f}(\textsf{a}) \rightarrow \textsf{b}\) and \(\textsf{a}\rightarrow \textsf{f}(\textsf{a})\), the situation is similar.

The TRS \(\mathcal {R}_2\) from Ex. 5 is erasing and \(\mathcal {R}_3\) is overlapping. For TRSs with neither of those two properties, SN and WN are equivalent.

Theorem 6

(From WN to SN [24]). If a TRS \(\mathcal {R}\) is non-overlapping and non-erasing, then \(\mathcal {R}\) is SN iff \(\mathcal {R}\) is WN.

Finally, we consider rewrite strategies that impose an ordering on parallel redexes, such as leftmost innermost rewriting, in comparison to plain innermost rewriting. It turns out that such an ordering does not interfere with termination at all.

Theorem 7

(From liSN to iSN [34]). For all TRSs \(\mathcal {R}\) we have that \(\mathcal {R}\) is iSN iff \(\mathcal {R}\) is liSN.

The relations between the different properties for non-probabilistic TRSs (given in Thm. 4, 6, and 7) are summarized below.

[Figure: summary of the relations between SN, iSN, liSN, and WN established in Thm. 4, 6, and 7]

3 Probabilistic Term Rewriting

In this section, we recapitulate probabilistic TRSs [4, 10, 29]. In contrast to TRSs, a PTRS has finite multi-distributions on the right-hand sides of its rewrite rules. A finite multi-distribution \(\mu \) on a set \(A \ne \varnothing \) is a finite multiset of pairs (p : a), where \(0 < p \le 1\) is a probability and \(a \in A\), such that \(\sum _{(p:a) \in \mu }p = 1\). \({\text {FDist}}(A)\) is the set of all finite multi-distributions on A. For \(\mu \in {\text {FDist}}(A)\), its support is the multiset \({\text {Supp}}(\mu )=\{a \mid (p:a)\in \mu \) for some \(p\}\). A probabilistic rewrite rule is a pair \(\ell \rightarrow \mu \in \mathcal {T}\left( \varSigma , \mathcal {V}\right) \times {\text {FDist}}(\mathcal {T}\left( \varSigma ,\mathcal {V}\right) )\) such that \(\ell \not \in \mathcal {V}\) and \(\mathcal {V}(r) \subseteq \mathcal {V}(\ell )\) for every \(r \in {\text {Supp}}(\mu )\). A probabilistic TRS (PTRS) is a (possibly infinite) set \(\mathcal {S}\) of probabilistic rewrite rules. Similar to TRSs, the PTRS \(\mathcal {S}\) induces a rewrite relation \({\rightarrow _{\mathcal {S}}} \subseteq \mathcal {T}\left( \varSigma ,\mathcal {V}\right) \times {\text {FDist}}(\mathcal {T}\left( \varSigma ,\mathcal {V}\right) )\) where \(s \rightarrow _{\mathcal {S}} \{p_1:t_1, \ldots , p_k:t_k\}\) if there is a position \(\pi \), a rule \(\ell \rightarrow \{p_1:r_1, \ldots , p_k:r_k\} \in \mathcal {S}\), and a substitution \(\sigma \) such that \(s|_{\pi }=\ell \sigma \) and \(t_j = s[r_j\sigma ]_{\pi }\) for all \(1 \le j \le k\). We call \(s \rightarrow _{\mathcal {S}} \mu \) an innermost rewrite step if all proper subterms of the used redex \(\ell \sigma \) are in normal form w.r.t. \(\mathcal {S}\), and a leftmost innermost rewrite step if, in addition, the rewrite step at position \(\pi \) is leftmost (i.e., there is no redex at a position \(\tau \) with \(\tau \prec \pi \)). For example, the PTRS \(\mathcal {S}_{\textsf{rw}}\) with the only rule \(\textsf{g}\rightarrow \{\nicefrac {1}{2}:\textsf{c}(\textsf{g},\textsf{g}), \; \nicefrac {1}{2}:\bot \}\) corresponds to a symmetric random walk on the number of \(\textsf{g}\)-symbols in a term.
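As a concrete illustration (ours, not from the paper), the following Monte Carlo sketch tracks only the number of \(\textsf{g}\)-symbols, which performs a symmetric random walk, and estimates the probability that \(\mathcal {S}_{\textsf{rw}}\) reaches a normal form within a step bound; the step bound and run count are arbitrary choices.

```python
import random

def simulate_S_rw(max_steps=10_000):
    """Rewrite some g-redex until none is left or the step bound is hit."""
    gs = 1                      # number of g-symbols in the current term
    for _ in range(max_steps):
        if gs == 0:
            return True          # normal form reached
        if random.random() < 0.5:
            gs += 1              # g -> c(g,g): one g becomes two (net +1)
        else:
            gs -= 1              # g -> ⊥
    return False                 # not yet terminated within the bound

runs = 10_000
hits = sum(simulate_S_rw() for _ in range(runs))
print(f"estimated termination probability: {hits / runs:.3f}")
# The symmetric random walk is AST, so the estimate approaches 1 as the step
# bound grows (even though the expected number of steps is infinite).
```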

As in [4, 14, 15, 29], we lift \(\rightarrow _{\mathcal {S}}\) to a rewrite relation between multi-distributions in order to track all probabilistic rewrite sequences (up to non-determinism) at once. For any \(0 < p \le 1\) and any \(\mu \in {\text {FDist}}(A)\), let \(p \cdot \mu = \{ (p\cdot q:a) \mid (q:a) \in \mu \}\).

Definition 8

(Lifting). The lifting \({\rightrightarrows } \subseteq {\text {FDist}}(\mathcal {T}\left( \varSigma ,\mathcal {V}\right) ) \times {\text {FDist}}(\mathcal {T}\left( \varSigma ,\mathcal {V}\right) )\) of a relation \({\rightarrow } \subseteq \mathcal {T}\left( \varSigma ,\mathcal {V}\right) \times {\text {FDist}}(\mathcal {T}\left( \varSigma ,\mathcal {V}\right) )\) is the smallest relation with:

  • If \(t \in \mathcal {T}\left( \varSigma ,\mathcal {V}\right) \) is in normal form w.r.t. \(\rightarrow \), then \(\{1: t\} \rightrightarrows \{1:t\}\).

  • If \(t \rightarrow \mu \), then \(\{1: t\} \rightrightarrows \mu \).

  • If for all \(1 \le j \le k\) there are \(\mu _j, \nu _j \in {\text {FDist}}(\mathcal {T}\left( \varSigma ,\mathcal {V}\right) )\) with \(\mu _j \rightrightarrows \nu _j\) and \(0 < p_j \le 1\) with \(\sum _{1 \le j \le k} p_j = 1\), then \(\bigcup _{1 \le j \le k} p_j \cdot \mu _j \rightrightarrows \bigcup _{1 \le j \le k} p_j \cdot \nu _j\).

For a PTRS \(\mathcal {S}\), we consider the liftings of \(\rightarrow _{\mathcal {S}}\), of the innermost rewrite relation, and of the leftmost innermost rewrite relation.

Example 9

For example, we obtain the following \(\mathrel {\rightrightarrows _{\mathcal {S}_{\textsf{rw}}}}\)-rewrite sequence (which is also a sequence w.r.t. the lifting of innermost rewriting, but not w.r.t. the lifting of leftmost innermost rewriting).

$$ \begin{array}{ll} &{}\{1:\textsf{g}\}\\ \mathrel {\rightrightarrows _{\mathcal {S}_{\textsf{rw}}}}&{}\{\nicefrac {1}{2}:\textsf{c}(\textsf{g},\textsf{g}), \nicefrac {1}{2}:\bot \}\\ \mathrel {\rightrightarrows _{\mathcal {S}_{\textsf{rw}}}}&{} \{\nicefrac {1}{4}:\textsf{c}(\textsf{c}(\textsf{g},\textsf{g}),\textsf{g}),\nicefrac {1}{4}:\textsf{c}(\bot ,\textsf{g}), \nicefrac {1}{2}:\bot \}\\ \mathrel {\rightrightarrows _{\mathcal {S}_{\textsf{rw}}}}&{} \{ \nicefrac {1}{8}:\textsf{c}(\textsf{c}(\textsf{g},\textsf{g}),\textsf{c}(\textsf{g},\textsf{g})), \nicefrac {1}{8}:\textsf{c}(\textsf{c}(\textsf{g},\textsf{g}),\bot ), \nicefrac {1}{8}:\textsf{c}(\bot ,\textsf{c}(\textsf{g},\textsf{g})), \nicefrac {1}{8}:\textsf{c}(\bot ,\bot ), \nicefrac {1}{2}:\bot \} \end{array} $$

To express the concept of almost-sure termination, one has to determine the probability for normal forms in a multi-distribution.

Definition 10

(\(|\mu |_{\mathcal {S}}\)). For a PTRS \(\mathcal {S}\), \(\texttt{NF}_{\mathcal {S}} \subseteq \mathcal {T}\left( \varSigma ,\mathcal {V}\right) \) denotes the set of all normal forms w.r.t. \(\mathcal {S}\). For any \(\mu \in {\text {FDist}}(\mathcal {T}\left( \varSigma ,\mathcal {V}\right) )\), let \(|\mu |_{\mathcal {S}} = \sum _{(p:t) \in \mu , t \in \texttt{NF}_{\mathcal {S}}} p\).

Example 11

Consider \(\{ \nicefrac {1}{8}:\textsf{c}(\textsf{c}(\textsf{g},\textsf{g}),\textsf{c}(\textsf{g},\textsf{g})), \nicefrac {1}{8}:\textsf{c}(\textsf{c}(\textsf{g},\textsf{g}),\bot ), \nicefrac {1}{8}:\textsf{c}(\bot ,\textsf{c}(\textsf{g},\textsf{g})),\nicefrac {1}{8}:\textsf{c}(\bot ,\bot ), \nicefrac {1}{2}:\bot \} = \mu \) from Ex. 9. Then \(|\mu |_{\mathcal {S}_{\textsf{rw}}} = \nicefrac {1}{8} + \nicefrac {1}{2} = \nicefrac {5}{8}\), since \(\textsf{c}(\bot ,\bot )\) and \(\bot \) are both normal forms w.r.t. \(\mathcal {S}_{\textsf{rw}}\).
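The normal-form probabilities of such a lifted sequence are easy to compute exactly. The following sketch (our own illustration; terms encoded as plain strings, with ⊥ written as "_|_") always rewrites the first \(\textsf{g}\) of every term that is not in normal form; which \(\textsf{g}\) is chosen does not affect \(|\mu |_{\mathcal {S}_{\textsf{rw}}}\) here, so the values agree with Ex. 9 and Ex. 11.

```python
from fractions import Fraction

RULE = [("c(g,g)", Fraction(1, 2)), ("_|_", Fraction(1, 2))]   # g -> {1/2: c(g,g), 1/2: ⊥}

def step(mu):
    """One step of the lifting: normal forms are kept, every other term is rewritten."""
    new = {}
    for term, p in mu.items():
        if "g" not in term:                          # normal form w.r.t. S_rw
            new[term] = new.get(term, 0) + p
            continue
        i = term.index("g")
        for rhs, q in RULE:
            t = term[:i] + rhs + term[i + 1:]
            new[t] = new.get(t, 0) + p * q
    return new

def nf_probability(mu):                              # |mu|_S as in Def. 10
    return sum(p for term, p in mu.items() if "g" not in term)

mu = {"g": Fraction(1)}
for n in range(4):
    print(n, nf_probability(mu))                     # 0, 1/2, 1/2, 5/8
    mu = step(mu)
# After three steps the probability of normal forms is 5/8, as in Ex. 11.
```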

Definition 12

(AST). Let \(\mathcal {S}\) be a PTRS and \(\vec {\mu } = (\mu _n)_{n \in \mathbb {N}}\) be an infinite \(\rightrightarrows _{\mathcal {S}}\)-rewrite sequence, i.e., \(\mu _n \rightrightarrows _{\mathcal {S}} \mu _{n+1}\) for all \(n \in \mathbb {N}\). We say that \(\vec {\mu }\) converges with probability \(\lim \limits _{n \rightarrow \infty }|\mu _n|_{\mathcal {S}}\). \(\mathcal {S}\) is almost-surely terminating (AST) (innermost AST (iAST) / leftmost innermost AST (liAST)) if \(\lim \limits _{n \rightarrow \infty } |\mu _n|_{\mathcal {S}} = 1\) holds for every infinite rewrite sequence \((\mu _n)_{n \in \mathbb {N}}\) w.r.t. the lifting of full (innermost / leftmost innermost) rewriting. To highlight the consideration of AST for full (instead of innermost) rewriting, we also speak of full AST (fAST) instead of “AST”. We say that \(\mathcal {S}\) is weakly AST (wAST) if for every term t there exists an infinite \(\rightrightarrows _{\mathcal {S}}\)-rewrite sequence \((\mu _n)_{n \in \mathbb {N}}\) with \(\lim \limits _{n \rightarrow \infty } |\mu _n|_{\mathcal {S}} = 1\) and \(\mu _0 = \{1:t\}\).

Example 13

For every infinite extension \((\mu _n)_{n \in \mathbb {N}}\) of the \(\mathrel {\rightrightarrows _{\mathcal {S}_{\textsf{rw}}}}\)-rewrite sequence in Ex. 9, we have \(\lim \limits _{n \rightarrow \infty } |\mu _n|_{\mathcal {S}_{\textsf{rw}}} = 1\). Indeed, \(\mathcal {S}_{\textsf{rw}}\) is fAST and thus also iAST, liAST, and wAST.

Next, we define positive almost-sure termination that considers the expected derivation length \({\text {edl}}(\vec {\mu })\) of a rewrite sequence \(\vec {\mu }\), i.e., the expected number of steps until one reaches a normal form. For PAST, we require that the expected derivation lengths of all possible rewrite sequences are finite. In the following definition, \((1 - |\mu _n|_{\mathcal {S}})\) is the probability of terms that are not in normal form w.r.t. \(\mathcal {S}\) after the n-th step.

Definition 14

(\({\text {edl}}\), PAST). Let \(\mathcal {S}\) be a PTRS and \(\vec {\mu } = (\mu _n)_{n \in \mathbb {N}}\) be an infinite \(\rightrightarrows _{\mathcal {S}}\)-rewrite sequence. By \({\text {edl}}(\vec {\mu }) = \sum _{n = 0}^{\infty } (1 - |\mu _n|_{\mathcal {S}})\) we denote the expected derivation length of \(\vec {\mu }\). \(\mathcal {S}\) is positively almost-surely terminating (PAST) (innermost PAST (iPAST) / leftmost innermost PAST (liPAST)) if \({\text {edl}}(\vec {\mu })\) is finite for every infinite rewrite sequence \(\vec {\mu } = (\mu _n)_{n \in \mathbb {N}}\) w.r.t. the lifting of full (innermost / leftmost innermost) rewriting. Again, we also speak of full PAST (fPAST) when considering PAST for the full rewrite relation \(\rightrightarrows _{\mathcal {S}}\). We say that \(\mathcal {S}\) is weakly PAST (wPAST) if for every term t there exists an infinite \(\rightrightarrows _{\mathcal {S}}\)-rewrite sequence \(\vec {\mu } = (\mu _n)_{n \in \mathbb {N}}\) such that \({\text {edl}}(\vec {\mu })\) is finite and \(\mu _0 = \{1:t\}\).

It is well known that PAST implies AST, but not vice versa.

Example 15

For every infinite extension \(\vec {\mu } = (\mu _n)_{n \in \mathbb {N}}\) of the \(\mathrel {\rightrightarrows _{\mathcal {S}_{\textsf{rw}}}}\)-rewrite sequence in Ex. 9, the expected derivation length \({\text {edl}}(\vec {\mu })\) is infinite, hence \(\mathcal {S}_{\textsf{rw}}\) is not PAST w.r.t. any of the strategies regarded in this paper.

In [4, 18], PAST was strengthened further to bounded or strong almost-sure termination (SAST). Indeed, our results on PAST can also be adapted to SAST (see [30]).

Many properties of TRSs from Sect. 2 can be lifted to PTRSs in a straightforward way: A PTRS \(\mathcal {S}\) is right-linear (non-erasing) iff the TRS \(\{\ell \rightarrow r \mid \ell \rightarrow \mu \in \mathcal {S}, r \in {\text {Supp}}(\mu )\}\) has the respective property. Moreover, all properties that just consider the left-hand sides, e.g., left-linearity, being non-overlapping, orthogonality, and being an overlay system, can be lifted to PTRSs directly as well, since their rules again only have a single left-hand side.

4 Relating Variants of AST

Our goal is to relate AST of full rewriting to restrictions of fAST, i.e., to iAST (Sect. 4.1), wAST (Sect. 4.2), and liAST (Sect. 4.3). More precisely, we want to find properties of PTRSs which are suitable for automated checking and which guarantee that two variants of AST are equivalent. Then for example, we can use existing tools that analyze iAST in order to prove fAST. Clearly, we have to impose at least the same requirements as in the non-probabilistic setting, as every TRS \(\mathcal {R}\) can be transformed into a PTRS \(\mathcal {S}\) by replacing every rule \(\ell \rightarrow r\) with \(\ell \rightarrow \{1:r\}\). Then \(\mathcal {R}\) is SN / iSN / liSN iff \(\mathcal {S}\) is fAST / iAST / liAST. While we mostly focus on AST, all results and counterexamples in this section also hold for PAST.
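For illustration, the embedding of TRSs into PTRSs mentioned above is a one-liner in the rule encoding used in our earlier sketches (the encoding itself is an assumption of this illustration, not the paper's notation):

```python
# Every non-probabilistic rule l -> r becomes the probabilistic rule l -> {1: r}.
def trs_to_ptrs(trs):
    return [(lhs, [(1, rhs)]) for lhs, rhs in trs]

R_d = [(("d", ("s", "x")), ("s", ("s", ("d", "x")))), (("d", ("O",)), ("O",))]
print(trs_to_ptrs(R_d))
```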

4.1 From iAST to fAST

Again, we start by analyzing the relation between iAST and fAST. The following example shows that Thm. 2 does not carry over to the probabilistic setting, i.e., orthogonality is not sufficient to ensure that iAST implies fAST.

Counterexample 16 (Orthogonality Does Not Suffice)

Consider the orthogonal PTRS \(\mathcal {S}_1\) with the two rules:

$$\begin{aligned} \textsf{g}&\rightarrow \{\nicefrac {3}{4}:\textsf{d}(\textsf{g}), \nicefrac {1}{4}:\bot \} \qquad \qquad \qquad \textsf{d}(x) \rightarrow \{1:\textsf{c}(x,x)\} \end{aligned}$$

This PTRS is not fAST (and thus, also not fPAST), as we have \(\{1:\textsf{g}\} \rightrightarrows _{\mathcal {S}_1}^2 \{\nicefrac {3}{4}:\textsf{c}(\textsf{g},\textsf{g}), \nicefrac {1}{4}:\bot \}\), which corresponds to a random walk biased towards non-termination (since \(\tfrac{3}{4} > \tfrac{1}{2}\)).

However, the \(\textsf{d}\)-rule can only duplicate normal forms in innermost evaluations. To see that \(\mathcal {S}_1\) is iPAST (and thus, also iAST), consider the following rewrite sequence \(\vec {\mu }\):

$$ \begin{array}{ll} &{}\{1:\textsf{g}\}\\ \rightrightarrows _{\mathcal {S}_1}&{}\{\nicefrac {3}{4}:\textsf{d}(\textsf{g}), \nicefrac {1}{4}:\bot \}\\ \rightrightarrows _{\mathcal {S}_1}&{}\{\nicefrac {9}{16}:\textsf{d}(\textsf{d}(\textsf{g})), \nicefrac {3}{16}:\textsf{d}(\bot ), \nicefrac {1}{4}:\bot \}\\ \rightrightarrows _{\mathcal {S}_1}&{}\{\nicefrac {27}{64}:\textsf{d}(\textsf{d}(\textsf{d}(\textsf{g}))), \nicefrac {9}{64}:\textsf{d}(\textsf{d}(\bot )), \nicefrac {3}{16}:\textsf{c}(\bot ,\bot ), \nicefrac {1}{4}:\bot \}\\ \rightrightarrows _{\mathcal {S}_1}&{} \ldots \end{array} $$

We can also view this rewrite sequence as a tree:

[Figure: the above rewrite sequence depicted as a tree of terms, where each node is labeled with its probability]

The branch to the right that starts with \(\bot \) stops after 0 innermost steps, the branch that starts with \(\textsf{d}(\bot )\) stops after 1 innermost step, the branch that starts with \(\textsf{d}(\textsf{d}(\bot ))\) stops after 2 innermost steps, and so on. So if we start with the term \(\textsf{d}^n(\bot )\), then we reach a normal form after n steps, and we reach \(\textsf{d}^n(\bot )\) after \(n+1\) steps from the initial term \(\textsf{g}\), where \(\textsf{d}^n(\bot ) = \underbrace{\textsf{d}(\ldots (\textsf{d}}_{n\text {-times}}(\bot ))\ldots )\). Hence, for every \(k \in \mathbb {N}\) we have \(|\mu _{2\cdot k + 1}|_{\mathcal {S}_1} = |\mu _{2 \cdot k + 2}|_{\mathcal {S}_1} = \sum _{n=0}^{k} \nicefrac {1}{4} \cdot (\nicefrac {3}{4})^n\) and thus

$$ \begin{array}{rclcl} {\text {edl}}(\vec {\mu }) &{}=&{} \sum \nolimits _{n = 0}^{\infty } (1 - |\mu _n|_{\mathcal {S}_1}) &{}=&{} 1 + 2 \cdot \sum \nolimits _{k \in \mathbb {N}} (1 - |\mu _{2\cdot k + 1}|_{\mathcal {S}_1}) \\ &{}=&{} 1 + 2 \cdot \sum \nolimits _{k \in \mathbb {N}} (1 - \sum \nolimits _{n=0}^{k} \nicefrac {1}{4} \cdot (\nicefrac {3}{4})^n) &{}=&{} 1 + 2 \cdot \sum \nolimits _{k \in \mathbb {N}} (\nicefrac {3}{4})^{k+1}\\ &{}=&{} (2 \cdot \sum \nolimits _{k \in \mathbb {N}} (\nicefrac {3}{4})^{k}) -1 &{}=&{} 7 \end{array} $$

Analogously, in all other innermost rewrite sequences, the \(\textsf{d}\)-rule can also only duplicate normal forms. Thus, all possible innermost rewrite sequences have finite expected derivation length. Therefore, \(\mathcal {S}_1\) is iPAST and thus, also iAST. The latter can also be proved automatically by our implementation of the probabilistic DP framework for iAST [29] in AProVE.
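As a quick numeric cross-check of this value (an illustrative sketch of ours, not part of the paper): under innermost evaluation starting from \(\textsf{g}\), with probability \((\nicefrac {3}{4})^n \cdot \nicefrac {1}{4}\) the \(\textsf{g}\)-rule chooses the \(\textsf{d}(\textsf{g})\)-branch n times before producing \(\bot \) (n+1 steps), and the n surrounding \(\textsf{d}\)-symbols are then removed in n further innermost steps, i.e., 2n+1 steps in total.

```python
import random

def innermost_steps_S1():
    n = 0
    while random.random() < 0.75:   # g -> d(g) branch taken
        n += 1
    return 2 * n + 1                # n+1 g-steps plus n d-steps

runs = 200_000
print(sum(innermost_steps_S1() for _ in range(runs)) / runs)   # ≈ 7
```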

To construct a counterexample for AST of \(\mathcal {S}_1\), we exploited the fact that \(\mathcal {S}_1\) is not right-linear. Indeed, requiring right-linearity yields our desired result. For reasons of space, here we only give a proof sketch. As mentioned, all full proofs can be found in [30].

Theorem 17

(From iAST/iPAST to fAST/fPAST (1)). If a PTRS \(\mathcal {S}\) is orthogonal and right-linear (i.e., non-overlapping and linear), then:

$$\begin{aligned} \mathcal {S}\text { is fAST} &\Longleftrightarrow \mathcal {S}\text { is iAST}\\ \mathcal {S}\text { is fPAST} &\Longleftrightarrow \mathcal {S}\text { is iPAST}\! \end{aligned}$$

Proof Sketch

We only have to prove the non-trivial direction “\(\Longleftarrow \)”. The proofs for all theorems in this section (for both AST and PAST) follow a similar structure. We always iteratively replace rewrite steps by steps that use the desired strategy and ensure that this does not increase the probability of termination (resp. the expected derivation length). For this replacement, we lift the corresponding construction from the non-probabilistic to the probabilistic setting. However, this cannot be done directly but instead, we have to regard the “limit” of a sequence of transformation steps.

We first consider fAST and iAST. Let \(\mathcal {S}\) be a PTRS that is non-overlapping, linear, and not fAST. Thus, there exists an infinite rewrite sequence \(\vec {\mu } = (\mu _n)_{n \in \mathbb {N}}\) such that \(\lim _{n \rightarrow \infty } |\mu _n|_{\mathcal {S}} = c\) for some \(c \in \mathbb {R}\) with \(0 \le c < 1\). Our goal is to transform this sequence into an innermost sequence that converges with probability at most c. If the sequence is not yet an innermost one, then in \((\mu _n)_{n \in \mathbb {N}}\) at least one rewrite step is performed with a redex that is not an innermost redex. Since \(\mathcal {S}\) is non-overlapping, we can replace the first such non-innermost rewrite step with an innermost rewrite step using a similar construction as in the non-probabilistic setting. In this way, we obtain a rewrite sequence \(\vec {\mu }^{(1)} = (\mu ^{(1)}_n)_{n \in \mathbb {N}}\) with \(\lim _{n \rightarrow \infty } |\mu ^{(1)}_n|_{\mathcal {S}} = \lim _{n \rightarrow \infty } |\mu _n|_{\mathcal {S}} = c\). Here, linearity is needed to ensure that the probability of termination does not increase during this replacement. We can then repeat this replacement for every non-innermost rewrite step, i.e., we again replace the first non-innermost rewrite step in \((\mu ^{(1)}_n)_{n \in \mathbb {N}}\) to obtain \((\mu ^{(2)}_n)_{n \in \mathbb {N}}\) with the same termination probability, etc. In the end, the limit of all these rewrite sequences \(\lim _{i \rightarrow \infty } (\mu ^{(i)}_n)_{n \in \mathbb {N}}\) is an innermost rewrite sequence that converges with probability at most \(c < 1\), and hence, the PTRS \(\mathcal {S}\) is not innermost AST.

For fPAST and iPAST, we start with an infinite rewrite sequence \(\vec {\mu }\) such that \({\text {edl}}(\vec {\mu }) = \infty \). Again, we replace the first non-innermost rewrite step with an innermost rewrite step using exactly the same construction as before to obtain \(\vec {\mu }^{(1)}\), etc., since \(\vec {\mu }^{(1)}\) does not only have the same termination probability as \(\vec {\mu }\), but we also have \({\text {edl}}(\vec {\mu }^{(1)}) \ge {\text {edl}}(\vec {\mu })\). In the end, the limit of all these rewrite sequences \(\lim _{i \rightarrow \infty }\vec {\mu }^{(i)}\) is an innermost rewrite sequence such that \({\text {edl}}(\lim _{i \rightarrow \infty }\vec {\mu }^{(i)}) \ge {\text {edl}}(\vec {\mu }) = \infty \), and hence, the PTRS \(\mathcal {S}\) is not innermost PAST.

One may wonder whether we can remove the left-linearity requirement from Thm. 17, as in the non-probabilistic setting. It turns out that this is not possible.

Counterexample 18 (Left-Linearity Cannot be Removed)

Consider the PTRS \(\mathcal {S}_2\) with the rules:

$$\begin{aligned} \textsf{f}(x,x) &\rightarrow \{1:\textsf{f}(\textsf{a},\textsf{a})\} \qquad \qquad \qquad \textsf{a}\rightarrow \{\nicefrac {1}{2}:\textsf{b}, \nicefrac {1}{2}:\textsf{c}\} \end{aligned}$$

\(\mathcal {S}_2\) is not fAST (hence also not fPAST), since \(\{1:\textsf{f}(\textsf{a},\textsf{a})\} \rightrightarrows _{\mathcal {S}_2} \{1:\textsf{f}(\textsf{a},\textsf{a})\} \rightrightarrows _{\mathcal {S}_2} \ldots \) is an infinite rewrite sequence that converges with probability 0. However, it is iPAST (and hence, iAST), since the corresponding innermost sequence has the form \(\{1:\textsf{f}(\textsf{a},\textsf{a})\} \rightrightarrows _{\mathcal {S}_2} \{\nicefrac {1}{2}:\textsf{f}(\textsf{b},\textsf{a}), \nicefrac {1}{2}:\textsf{f}(\textsf{c},\textsf{a})\} \rightrightarrows _{\mathcal {S}_2} \{\nicefrac {1}{4}:\textsf{f}(\textsf{b},\textsf{b}), \nicefrac {1}{4}:\textsf{f}(\textsf{b},\textsf{c}), \nicefrac {1}{4}:\textsf{f}(\textsf{c},\textsf{b}), \nicefrac {1}{4}:\textsf{f}(\textsf{c},\textsf{c})\}\). Here, the last distribution contains two normal forms \(\textsf{f}(\textsf{b},\textsf{c})\) and \(\textsf{f}(\textsf{c},\textsf{b})\) that did not occur in the previous rewrite sequence. Since all innermost rewrite sequences keep on adding such normal forms after a certain number of steps for each start term, they always have finite expected derivation length and thus, converge with probability 1 (again, iAST can be shown automatically by AProVE). Note that adding the requirement of being non-erasing would not help to get rid of the left-linearity either, as shown by the PTRS \(\mathcal {S}_3\) which results from \(\mathcal {S}_2\) by replacing the \(\textsf{f}\)-rule with \(\textsf{f}(x,x) \rightarrow \{1:\textsf{d}(\textsf{f}(\textsf{a},\textsf{a}), x)\}\).

The problem here is that although we rewrite both occurrences of \(\textsf{a}\) with the same rewrite rule, the two \(\textsf{a}\)-symbols are replaced by two different terms (each with a probability \(> 0\)). This is impossible in the non-probabilistic setting.
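To see the contrast concretely, the following small sketch (our own illustration, not taken from the paper) simulates innermost evaluation of \(\mathcal {S}_2\) starting from \(\textsf{f}(\textsf{a},\textsf{a})\): every run terminates, and the average number of steps approaches a finite value (5 for this start term), which is consistent with \(\mathcal {S}_2\) being iPAST.

```python
import random

def innermost_run_S2():
    steps = 0
    while True:
        # innermost: rewrite the two a's (one after the other) before the f-rule
        left, right = random.choice("bc"), random.choice("bc")
        steps += 2
        if left != right:          # f(b,c) / f(c,b): normal form reached
            return steps
        steps += 1                 # f(b,b) / f(c,c): f-rule leads back to f(a,a)

runs = 100_000
print(sum(innermost_run_S2() for _ in range(runs)) / runs)   # ≈ 5
```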

Next, one could try to adapt Thm. 4 to the probabilistic setting (when requiring linearity in addition). So one could investigate whether iAST implies fAST for PTRSs that are linear locally confluent overlay systems. A PTRS \(\mathcal {S}\) is locally confluent if for all multi-distributions \(\mu , \mu _1, \mu _2\) such that \(\mu _1 \leftleftarrows _{\mathcal {S}} \mu \rightrightarrows _{\mathcal {S}} \mu _2\), there exists a multi-distribution \(\mu '\) such that \(\mu _1 \rightrightarrows _{\mathcal {S}}^* \mu ' \leftleftarrows _{\mathcal {S}}^* \mu _2\), see [14]. Note that in contrast to the non-probabilistic setting, there are non-overlapping PTRSs that are not locally confluent (e.g., the variant \(\mathcal {S}_2'\) of \(\mathcal {S}_2\) that consists of the rules \(\textsf{f}(x,x) \rightarrow \{1:\textsf{d}\}\) and \(\textsf{a}\rightarrow \{\nicefrac {1}{2}:\textsf{b}, \nicefrac {1}{2}:\textsf{c}\}\), since we have \(\{1:\textsf{d}\} \leftleftarrows _{\mathcal {S}_2'} \{1:\textsf{f}(\textsf{a},\textsf{a})\} \rightrightarrows _{\mathcal {S}_2'} \{\nicefrac {1}{2}:\textsf{f}(\textsf{b},\textsf{a}), \nicefrac {1}{2}:\textsf{f}(\textsf{c},\textsf{a})\}\) and the two resulting multi-distributions are not joinable). Thus, such an adaption of Thm. 4 would not subsume Thm. 17.

In contrast to the proof of Thm. 2, the proof of Thm. 4 relies on a minimality requirement for the used redex. In the non-probabilistic setting, whenever a term t starts an infinite rewrite sequence, then there exists a position \(\pi \) of t such that there is an infinite rewrite sequence of t starting with the redex \(t|_\pi \), but no infinite rewrite sequence of t starting with a redex at a position \(\tau > \pi \) which is strictly below \(\pi \). In other words, if t starts an infinite rewrite sequence, then there is a “minimal” infinite rewrite sequence starting in t, i.e., as soon as one reduces a proper subterm of one of the redexes in the sequence, then one obtains a term which is terminating. However, such minimal infinite sequences do not always exist in the probabilistic setting.

Example 19

(No Minimal Infinite Rewrite Sequence for AST). Reconsider the PTRS \(\mathcal {S}_1\) from Ex. 16, which is not fAST. However, there is no “minimal” rewrite sequence with convergence probability \(< 1\) such that one rewrite step at a proper subterm of a redex would modify the multi-distribution in such a way that now only rewrite sequences with convergence probability 1 are possible. We have \(\{1:\textsf{g}\} \rightrightarrows _{\mathcal {S}_1} \{\nicefrac {3}{4}:\textsf{d}(\textsf{g}), \nicefrac {1}{4}:\bot \}\). In Ex. 16, we now alternated between the \(\textsf{d}\)- and the \(\textsf{g}\)-rule, resulting in a biased random walk, i.e., we obtained \(\{\nicefrac {3}{4}:\textsf{d}(\textsf{g}), \nicefrac {1}{4}:\bot \} \rightrightarrows _{\mathcal {S}_1} \{\nicefrac {3}{4}:\textsf{c}(\textsf{g},\textsf{g}), \nicefrac {1}{4}:\bot \} \rightrightarrows _{\mathcal {S}_1} \{\nicefrac {3}{4}:\textsf{c}(\textsf{d}(\textsf{g}),\textsf{g}), \nicefrac {1}{4}:\bot \} \rightrightarrows _{\mathcal {S}_1} \ldots \) The steps with the \(\textsf{d}\)-rule use redexes that have \(\textsf{g}\) as a proper subterm.

However, there does not exist any “minimal” non-fAST sequence. If we rewrite the proper subterm \(\textsf{g}\) of a redex \(\textsf{d}(\textsf{g})\), then this still yields a multi-distribution that is not fAST, i.e., it can still start a rewrite sequence with convergence probability \(< 1\). For example, we have \(\{\nicefrac {3}{4}:\textsf{d}(\textsf{g}), \nicefrac {1}{4}:\bot \} \rightrightarrows _{\mathcal {S}_1} \{(\nicefrac {3}{4})^2:\textsf{d}(\textsf{d}(\textsf{g})),\nicefrac {1}{4} \cdot \nicefrac {3}{4}:\textsf{d}(\bot ), \nicefrac {1}{4}:\bot \}\), but the obtained multi-distribution still contains the subterm \(\textsf{g}\), and thus, one can still continue the rewrite sequence in such a way that its convergence probability is \(< 1\). Again, the same example also shows that there is no “minimal” non-fPAST sequence.

It remains open whether one can also adapt Thm. 4 to the probabilistic setting (e.g., if one can replace non-overlappingness in Thm. 17 by the requirement of locally confluent overlay systems). There are two main difficulties when trying to adapt the proof of this theorem to PTRSs. First, the minimality requirement cannot be imposed in the probabilistic setting, as discussed above. In the non-probabilistic setting, this requirement is needed to ensure that rewriting below a position that was reduced in the original (minimal) infinite rewrite sequence leads to a strongly normalizing rewrite sequence. Second, the original proof of Thm. 4 uses Newman’s Lemma [39] which states that local confluence implies confluence for strongly normalizing terms t, and thus it implies that t has a unique normal form. Local confluence and adaptions of the unique normal form property for the probabilistic setting have been studied in [14, 15], which concluded that obtaining an analogous statement to Newman’s Lemma for PTRSs that are AST (or PAST) would be very difficult. The reason is that one cannot use well-founded induction on the length of a rewrite sequence of a PTRS that is AST (or PAST), since these rewrite sequences may be infinite.

4.2 From wAST to fAST

Next, we investigate wAST. Since iAST implies wAST, we essentially have the same problems as for innermost AST, i.e., in addition to non-overlappingness, we need linearity: as seen in Ex. 16 and 18, \(\mathcal {S}_1\) and \(\mathcal {S}_3\) are iAST (and hence wAST) but not fAST, although they are non-overlapping and non-erasing, but not linear. Furthermore, we need non-erasingness as in the non-probabilistic setting, for the same reasons, see Ex. 5.

Theorem 20

(From wAST/wPAST to fAST/fPAST). If a PTRS \(\mathcal {S}\) is non-overlapping, linear, and non-erasing, then

$$\begin{aligned} \mathcal {S}\text { is fAST} &\Longleftrightarrow \mathcal {S}\text { is wAST}\\ \mathcal {S}\text { is fPAST} &\Longleftrightarrow \mathcal {S}\text { is wPAST}\! \end{aligned}$$

4.3 From liAST to fAST

Finally, we look at leftmost-innermost AST as an example for a rewrite strategy that uses an ordering for parallel redexes. In contrast to the non-probabilistic setting, it turns out that liAST and iAST are not equivalent in general. The counterexample is similar to Ex. 18, which illustrated that fAST and iAST are not equivalent without left-linearity.

Counterexample 21

Consider the PTRS \(\mathcal {S}_4\) with the five rules:

[Display: the five rules of \(\mathcal {S}_4\)]

This PTRS is not iAST (and hence not iPAST) since there exists an infinite innermost rewrite sequence which converges with probability 0. It first “splits” the term \(\textsf{f}(\textsf{a},\textsf{b})\) with the \(\textsf{b}\)-rule, and then applies one of the two different \(\textsf{a}\)-rules to each of the resulting terms. In contrast, when applying a leftmost innermost rewrite strategy, we have to decide which \(\textsf{a}\)-rule to use. For example, after such a choice, the resulting multi-distribution contains the term \(\textsf{f}(\textsf{c}_1,\textsf{d}_2)\), which is a normal form. Since all leftmost innermost rewrite sequences keep on adding such normal forms after a certain number of steps for each start term, the PTRS is liAST (and also liPAST).

The counterexample above can easily be adapted to variants of innermost rewriting that impose different orders on parallel redexes like, e.g., rightmost innermost rewriting.

However, liAST and iAST are again equivalent for non-overlapping PTRSs. For such PTRSs, at most one rule can be used to rewrite at a given position, which prevents the problem illustrated in Ex. 21.

Theorem 22

(From liAST/liPAST to iAST/iPAST). If a PTRS \(\mathcal {S}\) is non-overlapping, then

$$\begin{aligned} \mathcal {S}\text { is iAST} &\Longleftrightarrow \mathcal {S}\text { is liAST}\\ \mathcal {S}\text { is iPAST} &\Longleftrightarrow \mathcal {S}\text { is liPAST}\! \end{aligned}$$

The relations between the different properties for AST of PTRSs (given in Thm. 17, 20, and 22) are summarized below. An analogous figure also holds for PAST.

[Figure: summary of the relations between fAST, iAST, liAST, and wAST established in Thm. 17, 20, and 22]

5 Improving Applicability

In this section, we improve the applicability of Thm. 17, which relates fAST and iAST. The results of Sect. 5.1 allow us to remove the requirement of left-linearity by modifying the rewrite relation to simultaneous rewriting. Then in Sect. 5.2 we show that the requirement of right-linearity can be weakened to spareness if one only considers rewrite sequences that start with basic terms.

5.1 Removing Left-Linearity by Simultaneous Rewriting

First, we will see that we do not need to require left-linearity if we allow the simultaneous reduction of several copies of identical redexes. For a PTRS \(\mathcal {S}\), this results in the notion of simultaneous rewriting. While simultaneous rewriting over-approximates ordinary probabilistic rewriting, existing techniques for proving iAST [29, 32] (except for the rewriting processor) do not distinguish between both notions of rewriting, i.e., these techniques even prove that every rewrite sequence w.r.t. the lifting of innermost simultaneous rewriting converges with probability 1. So for non-overlapping and right-linear PTRSs, these techniques can be used to prove innermost almost-sure termination w.r.t. simultaneous rewriting, which then implies fAST. The following example illustrates our approach for handling non-left-linear PTRSs by applying the same rewrite rule at parallel positions simultaneously.

Example 23

(Simultaneous Rewriting). Reconsider the PTRS \(\mathcal {S}_2\) from Ex. 18 with the rules \(\textsf{f}(x,x) \rightarrow \{1:\textsf{f}(\textsf{a},\textsf{a})\}\) and \(\textsf{a}\rightarrow \{\nicefrac {1}{2}:\textsf{b}, \nicefrac {1}{2}:\textsf{c}\}\), which is iAST, but not fAST. Our new rewrite relation allows us to reduce several copies of the same redex simultaneously, so that we get the innermost simultaneous rewrite sequence \(\{1:\textsf{f}(\textsf{a},\textsf{a})\}, \; \{\nicefrac {1}{2}:\textsf{f}(\textsf{b},\textsf{b}), \nicefrac {1}{2}:\textsf{f}(\textsf{c},\textsf{c})\}, \; \{\nicefrac {1}{2}:\textsf{f}(\textsf{a},\textsf{a}), \nicefrac {1}{2}:\textsf{f}(\textsf{a},\textsf{a})\}, \ldots \), i.e., this sequence converges with probability 0 and thus, \(\mathcal {S}_2\) is not iAST w.r.t. simultaneous rewriting. Note that we simultaneously reduced both occurrences of \(\textsf{a}\) in the first step.

Definition 24

(Simultaneous Rewriting). Let \(\mathcal {S}\) be a PTRS. A term s rewrites simultaneously to a multi-distribution \(\mu = \{p_1:t_1, \ldots , p_k:t_k\}\) if there is a non-empty set of parallel positions \(\varPi \), a rule \(\ell \rightarrow \{p_1:r_1, \ldots , p_k:r_k\} \in \mathcal {S}\), and a substitution \(\sigma \) such that \(s|_{\pi }=\ell \sigma \) and \(t_j = s[r_j\sigma ]_{\pi }\) for every position \(\pi \in \varPi \) and for all \(1 \le j \le k\). Such a step is an innermost simultaneous rewrite step if all proper subterms of the redex \(\ell \sigma \) are in normal form w.r.t. \(\mathcal {S}\).

Clearly, if the set of positions \(\varPi \) from Def. 24 is a singleton, then the resulting simultaneous rewrite step is an “ordinary” probabilistic rewrite step, i.e., every (innermost) \(\rightarrow _{\mathcal {S}}\)-step is also an (innermost) simultaneous rewrite step.
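The following is a minimal sketch (ours) of a single simultaneous rewrite step in the sense of Def. 24, under the simplifying assumption that the left-hand side is ground (so no matching is needed); the term encoding follows our earlier illustrations.

```python
from fractions import Fraction

def replace(term, pos, new):
    if not pos:
        return new
    args = list(term[1:])
    args[pos[0]] = replace(args[pos[0]], pos[1:], new)
    return (term[0],) + tuple(args)

def simultaneous_step(term, parallel_positions, rule):
    """All given parallel occurrences of the redex are replaced by the *same*
    right-hand side r_j in the j-th outcome of the resulting multi-distribution."""
    _lhs, rhs_dist = rule            # we assume term|_pi equals lhs at every position
    mu = []
    for p, r in rhs_dist:
        t = term
        for pos in parallel_positions:
            t = replace(t, pos, r)   # same outcome r at every position
        mu.append((p, t))
    return mu

# Ex. 23: f(a,a) with the rule a -> {1/2: b, 1/2: c}, applied at both arguments
a_rule = (("a",), [(Fraction(1, 2), ("b",)), (Fraction(1, 2), ("c",))])
term = ("f", ("a",), ("a",))
print(simultaneous_step(term, [(0,), (1,)], a_rule))
# [(Fraction(1, 2), ('f', ('b',), ('b',))), (Fraction(1, 2), ('f', ('c',), ('c',)))]
```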

Corollary 25

(From Simultaneous Rewriting to \(\rightarrow _{\mathcal {S}}\)). If \(\mathcal {S}\) is fAST (iAST) w.r.t. simultaneous rewriting, i.e., every infinite (innermost) simultaneous rewrite sequence converges with probability 1, then \(\mathcal {S}\) is fAST (iAST). Analogously, if \(\mathcal {S}\) is fPAST (iPAST) w.r.t. simultaneous rewriting, i.e., every infinite (innermost) simultaneous rewrite sequence has finite expected derivation length, then \(\mathcal {S}\) is fPAST (iPAST).

However, the converse of Cor. 25 does not hold. Ex. 23 shows that innermost simultaneous rewriting allows for rewrite sequences that are not possible with ordinary innermost rewriting, and the following example shows the same for simultaneous rewriting and \(\rightarrow _{\mathcal {S}}\).

Counterexample 26

Consider the PTRS \(\overline{\mathcal {S}}_2\) with the three rules:

[Display: the three rules of \(\overline{\mathcal {S}}_2\)]

This PTRS is fAST. But as in Ex. 23, there are innermost simultaneous rewrite sequences, and thus also simultaneous rewrite sequences, that converge with probability 0. Hence, \(\overline{\mathcal {S}}_2\) is neither iAST nor fAST w.r.t. simultaneous rewriting. Again, the same example also shows that fPAST and fPAST w.r.t. simultaneous rewriting are not equivalent either.

Note that this kind of simultaneous rewriting is different from the “ordinary” parallelism used for non-probabilistic rewriting, which is typically denoted by \(\rightarrow _{||}\). There, one may reduce multiple parallel redexes in a single rewrite step. Here, we do not only allow reducing multiple redexes, but in addition we “merge” the corresponding terms in the multi-distributions that result from rewriting the different redexes. Because of this merging, we only allow the simultaneous reduction of equal redexes, whereas “ordinary” parallel rewriting allows the simultaneous reduction of arbitrary parallel redexes. For example, for \(\mathcal {S}_2\) from Ex. 18, a simultaneous step yields \(\{\nicefrac {1}{2}:\textsf{f}(\textsf{b},\textsf{b}), \nicefrac {1}{2}:\textsf{f}(\textsf{c},\textsf{c})\}\) from \(\textsf{f}(\textsf{a},\textsf{a})\), whereas using ordinary parallel rewriting we would get \(\{\nicefrac {1}{4}:\textsf{f}(\textsf{b},\textsf{b}), \nicefrac {1}{4}:\textsf{f}(\textsf{b},\textsf{c}), \nicefrac {1}{4}:\textsf{f}(\textsf{c},\textsf{b}), \nicefrac {1}{4}:\textsf{f}(\textsf{c},\textsf{c})\}\).

The following theorem shows that indeed, we do not need to require left-linearity when moving from iAST/iPAST w.r.t. simultaneous rewriting to fAST/fPAST w.r.t. \(\rightarrow _{\mathcal {S}}\).

Theorem 27

(From iAST/iPAST to fAST/fPAST (2)). If a PTRS \(\mathcal {S}\) is non-overlapping and right-linear, then

$$\begin{aligned} \mathcal {S}\text { is iAST w.r.t. simultaneous rewriting} &\Longrightarrow \mathcal {S}\text { is fAST}\\ \mathcal {S}\text { is iPAST w.r.t. simultaneous rewriting} &\Longrightarrow \mathcal {S}\text { is fPAST}\! \end{aligned}$$

Proof Sketch

We use an analogous construction as for the proof of Thm. 17, but in addition, if we replace a non-innermost rewrite step by an innermost one, then we check whether in the original rewrite sequence, the corresponding innermost redex is “inside” the substitution used for the non-innermost rewrite step. In that case, if this rewrite step applied a non-left-linear rule, then we identify all other (equal) innermost redexes and rewrite them simultaneously (as we did for the innermost redex \(\textsf{a}\) in Ex. 23).

Note that Ex. 26 shows that the converse direction does not hold in Thm. 27. The following example shows that right-linearity in Thm. 27 cannot be weakened to the requirement that \(\mathcal {S}\) is non-duplicating (i.e., that no variable occurs more often in a term on the right-hand side of a rule than on its left-hand side).

Counterexample 28 (Non-Duplicating Does Not Suffice)

Let \(\textsf{d}(\textsf{f}(\textsf{a},\textsf{a})^3)\) abbreviate \(\textsf{d}(\textsf{f}(\textsf{a},\textsf{a}),\textsf{f}(\textsf{a},\textsf{a}),\textsf{f}(\textsf{a},\textsf{a}))\). Consider the PTRS \(\mathcal {S}_5\) with the four rules:

$$\begin{aligned} \textsf{f}(x,x) &\rightarrow \{1:\textsf{g}(x,x)\} &\textsf{g}(\textsf{b},\textsf{c}) &\rightarrow \{1:\textsf{d}(\textsf{f}(\textsf{a},\textsf{a})^3)\}\\ \textsf{a}&\rightarrow \{\nicefrac {1}{2}:\textsf{b}, \nicefrac {1}{2}:\textsf{c}\} &\textsf{g}(\textsf{c},\textsf{b}) &\rightarrow \{1:\textsf{d}(\textsf{f}(\textsf{a},\textsf{a})^3)\} \end{aligned}$$

\(\mathcal {S}_5\) is not fAST (and thus, also not fPAST), since the infinite rewrite sequence \({\small \{1 : \textsf{f}(\textsf{a},\textsf{a})\} \rightrightarrows _{\mathcal {S}_5} \{1:\textsf{g}(\textsf{a},\textsf{a})\} \rightrightarrows _{\mathcal {S}_5}^2 \{\nicefrac {1}{4}:\textsf{g}(\textsf{b},\textsf{b}), \nicefrac {1}{4}:\textsf{g}(\textsf{b},\textsf{c}), \nicefrac {1}{4}:\textsf{g}(\textsf{c},\textsf{b}), \nicefrac {1}{4}:\textsf{g}(\textsf{c},\textsf{c})\} \rightrightarrows _{\mathcal {S}_5}^2 \{\nicefrac {1}{4}:\textsf{g}(\textsf{b},\textsf{b}), \nicefrac {1}{4}:\textsf{d}(\textsf{f}(\textsf{a},\textsf{a})^3), \nicefrac {1}{4}:\textsf{d}(\textsf{f}(\textsf{a},\textsf{a})^3), \nicefrac {1}{4}:\textsf{g}(\textsf{c},\textsf{c})\}}\) can be seen as a biased random walk on the number of \(\textsf{f}(\textsf{a},\textsf{a})\)-subterms that is not AST. However, for every innermost evaluation (with ordinary as well as with simultaneous rewriting) we have to rewrite the inner \(\textsf{a}\)-symbols first. Afterwards, the \(\textsf{f}\)-rule can only be used on redexes \(\textsf{f}(t,t)\) where the resulting term \(\textsf{g}(t,t)\) is a normal form. Thus, \(\mathcal {S}_5\) is iPAST (and hence, iAST) w.r.t. simultaneous rewriting.

Note that for wAST, the direction of the implication in Cor. 25 is reversed, since wAST requires that for each start term, there exists an infinite rewrite sequence that is almost-surely terminating, whereas fAST requires that all infinite rewrite sequences are almost-surely terminating. Thus, if there exists an infinite \(\rightrightarrows _\mathcal {S}\)-rewrite sequence that converges with probability 1 (showing that \(\mathcal {S}\) is wAST), then this is also a valid simultaneous rewrite sequence that converges with probability 1 (showing that \(\mathcal {S}\) is wAST w.r.t. simultaneous rewriting).

Corollary 29

(From \(\rightarrow _{\mathcal {S}}\) to Simultaneous Rewriting for wAST/wPAST). If \(\mathcal {S}\) is wAST (wPAST), then \(\mathcal {S}\) is wAST (wPAST) w.r.t. simultaneous rewriting.

One may wonder whether simultaneous rewriting could also be used to improve Thm. 20 by removing the requirement of left-linearity, but Ex. 30 shows this is not possible.

Counterexample 30

Consider the non-left-linear PTRS \(\mathcal {S}_6\) with the two rules:

$$\begin{aligned} \textsf{g}&\rightarrow \{\nicefrac {3}{4}:\textsf{d}(\textsf{g},\textsf{g}), \nicefrac {1}{4}:\bot \} \qquad \qquad \textsf{d}(x,x) \rightarrow \{1:x\} \end{aligned}$$

This PTRS is not fAST (and thus, also not fPAST), as we have \(\{1:\textsf{g}\} \rightrightarrows _{\mathcal {S}_6} \{\nicefrac {3}{4}:\textsf{d}(\textsf{g},\textsf{g}), \nicefrac {1}{4}:\bot \}\), which corresponds to a random walk biased towards non-termination if we never use the \(\textsf{d}\)-rule (since \(\tfrac{3}{4} > \tfrac{1}{2}\)). However, if we always use the \(\textsf{d}\)-rule directly after the \(\textsf{g}\)-rule, then we essentially end up with a PTRS whose only rule is \(\textsf{g}\rightarrow \{\nicefrac {3}{4}:\textsf{c}(\textsf{g}), \nicefrac {1}{4}:\bot \}\), which corresponds to flipping a biased coin until heads comes up. This proves that \(\mathcal {S}_6\) is wPAST and hence, also wAST. As \(\mathcal {S}_6\) is non-overlapping, right-linear, and non-erasing, this shows that a variant of Thm. 20 without the requirement of left-linearity needs more than just moving to simultaneous rewriting.

5.2 Weakening Right-Linearity to Spareness

To improve our results further, we introduce the notion of spareness. The idea of spareness is to require that variables which occur non-linear in right-hand sides may only be instantiated by normal forms. We already used spareness for non-probabilistic TRSs in [17] to find classes of TRSs where innermost and full runtime complexity coincide. For a PTRS \(\mathcal {S}\), we decompose its signature \(\varSigma = \varSigma _{C} \uplus \varSigma _{D}\) such that \(f \in \varSigma _D\) iff \(f = {\text {root}}(\ell )\) for some rule \(\ell \rightarrow \mu \in \mathcal {S}\). The symbols in \(\varSigma _{C}\) and \(\varSigma _{D}\) are called constructors and defined symbols, respectively.
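This decomposition of the signature is easy to mechanize; a small sketch (with our own rule and symbol encoding, writing \(\bot \) as "bot") for \(\mathcal {S}_{\textsf{rw}}\) from Sect. 3:

```python
def split_signature(ptrs, signature):
    """A symbol is defined iff it is the root of some left-hand side;
    all remaining symbols of the signature are constructors."""
    defined = {lhs[0] for lhs, _ in ptrs}
    return defined, set(signature) - defined

# S_rw has the single rule g -> {1/2: c(g,g), 1/2: ⊥}
S_rw = [(("g",), [(0.5, ("c", ("g",), ("g",))), (0.5, ("bot",))])]
print(split_signature(S_rw, {"g", "c", "bot"}))   # ({'g'}, {'c', 'bot'})
```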

Definition 31

(Spareness). Let \(\ell \rightarrow \mu \in \mathcal {S}\). A rewrite step \(\ell \sigma \rightarrow _\mathcal {S}\mu \sigma \) is spare if \(\sigma (x)\) is in normal form w.r.t. \(\mathcal {S}\) for every \(x \in \mathcal {V}\) that occurs more than once in some \(r \in {\text {Supp}}(\mu )\). A \(\rightrightarrows _{\mathcal {S}}\)-sequence is spare if each of its \(\rightarrow _\mathcal {S}\)-steps is spare. \(\mathcal {S}\) is spare if each \(\rightrightarrows _{\mathcal {S}}\)-sequence that starts with \(\{1 : t\}\) for a basic term t is spare. A term \(t \in \mathcal {T}\left( \varSigma ,\mathcal {V}\right) \) is basic if \(t = f(t_1, \ldots , t_n)\) such that \(f \in \varSigma _D\) and \(t_i \in \mathcal {T}\left( \varSigma _C,\mathcal {V}\right) \) for all \(1 \le i \le n\).

Example 32

Consider the PTRS \(\mathcal {S}_7\) with the two rules:

$$\begin{aligned} \textsf{g}&\rightarrow \{\nicefrac {3}{4}:\textsf{d}(\bot ), \nicefrac {1}{4}:\textsf{g}\} \qquad \qquad \textsf{d}(x) \rightarrow \{1:\textsf{c}(x,x)\} \end{aligned}$$

It is similar to the PTRS \(\mathcal {S}_1\) from Ex. 16, but we exchanged the symbols \(\textsf{g}\) and \(\bot \) in the right-hand side of the \(\textsf{g}\)-rule. This PTRS is orthogonal but duplicating due to the \(\textsf{d}\)-rule. However, in any rewrite sequence that starts with \(\{1:t\}\) for a basic term t we can only duplicate the constructor symbol \(\bot \) but no defined symbol. Hence, \(\mathcal {S}_7\) is spare.
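Checking whether a single rewrite step is spare is a direct application of Def. 31. The following sketch (our own encoding, with a deliberately simplified normal-form test that only recognizes \(\bot \), written as "bot") uses the duplicating \(\textsf{d}\)-rule of \(\mathcal {S}_7\):

```python
from collections import Counter

def variables(term):
    if isinstance(term, str):          # variables are plain strings
        return [term]
    return [x for arg in term[1:] for x in variables(arg)]

def step_is_spare(rule, sigma, is_normal_form):
    """The step lhs*sigma -> mu*sigma is spare iff every variable occurring more
    than once in some right-hand side is instantiated by a normal form."""
    _lhs, rhs_dist = rule
    for _p, r in rhs_dist:
        for x, n in Counter(variables(r)).items():
            if n > 1 and not is_normal_form(sigma[x]):
                return False
    return True

# d(x) -> {1: c(x,x)} is duplicating; the step is spare iff x is a normal form.
d_rule = (("d", "x"), [(1, ("c", "x", "x"))])
print(step_is_spare(d_rule, {"x": ("bot",)}, lambda t: t == ("bot",)))   # True
print(step_is_spare(d_rule, {"x": ("g",)}, lambda t: t == ("bot",)))     # False
```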

In general, it is undecidable whether a PTRS is spare, since spareness is already undecidable for non-probabilistic TRSs. However, there exist computable sufficient conditions for spareness, see [17].

If a PTRS is spare, and we start with a basic term, then we will only duplicate normal forms with our duplicating rules. This means that the duplicating rules do not influence the (expected) runtime and, more importantly for AST, the probability of termination. As in [17], which analyzed runtime complexity, we have to restrict ourselves to rewrite sequences that start with basic terms. So we only consider start terms where a single algorithm is applied to data, i.e., we may not have any nested defined symbols in our start terms. This leads to the following theorem, where “on basic terms” means that one only considers rewrite sequences that start with \(\{1:t\}\) for a basic term t. It can be proved by an analogous limit construction as in the proof of Thm. 17.

Theorem 33

(From iAST/iPAST to fAST/fPAST (3)). If a PTRS \(\mathcal {S}\) is orthogonal and spare, then

$$\begin{aligned} \mathcal {S}\text { is fAST on basic terms} &\Longleftrightarrow \mathcal {S}\text { is iAST on basic terms}\\ \mathcal {S}\text { is fPAST on basic terms} &\Longleftrightarrow \mathcal {S}\text { is iPAST on basic terms}\! \end{aligned}$$

While iAST on basic terms is the same as iAST in general, the requirement of basic start terms is a real restriction for fAST, i.e., there exist PTRSs that are fAST on basic terms, but not fAST in general.

Counterexample 34

Consider the PTRS \(\mathcal {S}_8\) with the two rules:

$$\begin{aligned} \textsf{g}&\rightarrow \{\nicefrac {3}{4}:\textsf{s}(\textsf{g}), \nicefrac {1}{4}:\bot \} \qquad \qquad \textsf{f}(\textsf{s}(x)) \rightarrow \{1:\textsf{c}(\textsf{f}(x),\textsf{f}(x))\} \end{aligned}$$

This PTRS behaves similarly to \(\mathcal {S}_1\) (see Ex. 16). It is not fAST (and thus, also not fPAST), as we have \(\{1:\textsf{f}(\textsf{g})\} \rightrightarrows _{\mathcal {S}_8}^2 \{\nicefrac {3}{4}:\textsf{c}(\textsf{f}(\textsf{g}),\textsf{f}(\textsf{g})), \nicefrac {1}{4}:\textsf{f}(\bot )\}\), which corresponds to a random walk biased towards non-termination (since \(\tfrac{3}{4} > \tfrac{1}{2}\)).

However, the only basic terms for this PTRS are \(\textsf{g}\) and \(\textsf{f}(t)\) for terms t that do not contain \(\textsf{g}\) or \(\textsf{f}\). A sequence starting with \(\textsf{g}\) corresponds to flipping a biased coin and a sequence starting with \(\textsf{f}(t)\) will clearly terminate. Hence, \(\mathcal {S}_8\) is fAST (and even fPAST) on basic terms. Furthermore, note that \(\mathcal {S}_8\) is iPAST (and thus, also iAST) analogous to \(\mathcal {S}_1\). This shows that Thm. 33 cannot be extended to fAST or fPAST in general.

One may wonder whether Thm. 33 can nevertheless be used in order to prove fAST of a PTRS \(\mathcal {S}\) on all terms by using a suitable transformation from \(\mathcal {S}\) to another PTRS \(\mathcal {S}'\) such that \(\mathcal {S}\) is fAST on all terms iff \(\mathcal {S}'\) is fAST on basic terms.

There is an analogous difference in the complexity analysis of non-probabilistic term rewrite systems. There, the concept of runtime complexity is restricted to rewrite sequences that start with a basic term, whereas the concept of derivational complexity allows arbitrary start terms. In [19], a transformation was presented that extends any (non-probabilistic) TRS \(\mathcal {R}\) by so-called generator rules \(\mathcal {G}(\mathcal {R})\) such that the derivational complexity of \(\mathcal {R}\) is the same as the runtime complexity of \(\mathcal {R}\cup \mathcal {G}(\mathcal {R})\), where \(\mathcal {G}(\mathcal {R})\) are considered to be relative rules whose rewrite steps do not “count” for the complexity. This transformation can indeed be reused to move from fAST on basic terms to fAST in general.

Lemma 35

A PTRS \(\mathcal {S}\) is fAST iff \(\mathcal {S}\cup \mathcal {G}(\mathcal {S})\) is fAST on basic terms.

For every defined symbol f, the idea of the transformation is to introduce a new constructor symbol \(\textsf{cons}_f\), and for every function symbol f it introduces a new defined symbol \(\textsf{enc}_{f}\). As an example, for \(\mathcal {S}_8\) from Ex. 34, instead of starting with the non-basic term \(\textsf{c}(\textsf{g},\textsf{f}(\textsf{g}))\), we start with the basic term \(\textsf{enc}_{\textsf{c}}(\textsf{cons}_{\textsf{g}},\textsf{cons}_{\textsf{f}}(\textsf{cons}_{\textsf{g}}))\), its so-called basic variant. The new defined symbol \(\textsf{enc}_{\textsf{c}}\) is used to first build the term \(\textsf{c}(\textsf{g},\textsf{f}(\textsf{g}))\) at the beginning of the rewrite sequence, i.e., it converts all occurrences of \(\textsf{cons}_{f}\) for \(f \in \varSigma _D\) back into the defined symbol f, and then we can proceed as if we started with the term \(\textsf{c}(\textsf{g},\textsf{f}(\textsf{g}))\) directly. For this conversion, we need another new defined symbol \(\textsf{argenc}\) that iterates through the term and replaces all new constructors \(\textsf{cons}_{f}\) by the original defined symbol f. Thus, we define the generator rules as in [19] (just with trivial probabilities in the right-hand sides \(\ell \rightarrow \{1:r\}\)), since we do not need any probabilities during this initial construction of the original start term.

Definition 36

(Generator Rules \(\mathcal {G}(\mathcal {S})\)). Let \(\mathcal {S}\) be a PTRS over the signature \(\varSigma \). Its generator rules \(\mathcal {G}(\mathcal {S})\) are the following set of rules

$$\begin{aligned} & \{\textsf{enc}_f(x_1, \ldots , x_n) \rightarrow \{1: f(\textsf{argenc}(x_1), \ldots , \textsf{argenc}(x_n))\} \mid f \in \varSigma \}\\ \cup \;&\{\textsf{argenc}(\textsf{cons}_f(x_1, \ldots , x_n)) \rightarrow \{1: f(\textsf{argenc}(x_1), \ldots , \textsf{argenc}(x_n))\} \mid f \in \varSigma _D\}\\ \cup \; & \{\textsf{argenc}(f(x_1, \ldots , x_n)) \rightarrow \{1: f(\textsf{argenc}(x_1), \ldots , \textsf{argenc}(x_n))\} \mid f \in \varSigma _C\}, \end{aligned}$$

where \(x_1, \ldots , x_n\) are pairwise different variables and where the function symbols \(\textsf{argenc}\), \(\textsf{cons}_f\), and \(\textsf{enc}_f\) are fresh (i.e., they do not occur in \(\mathcal {S}\)). Moreover, we define \(\varSigma _{\mathcal {G}(\mathcal {S})} = \{\textsf{enc}_f \mid f \in \varSigma \} \cup \{\textsf{argenc}\} \cup \{\textsf{cons}_f \mid f \in \varSigma _D\}\).
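For illustration, the following Python sketch constructs the generator rules of Def. 36 from a given signature. The term representation (a function symbol paired with a tuple of arguments, variables as plain strings), the encoding of a probabilistic rule as a pair of a left-hand side and a list of (probability, term) pairs, and the string names "enc_f", "argenc", "cons_f" are our own choices for this sketch and not taken from any existing tool.

# A term is a pair (symbol, args); variables are plain strings.
# sig maps every function symbol to its arity; defined is the set of defined symbols.
def generator_rules(sig, defined):
    """Generator rules G(S) of Def. 36, each with the trivial probability 1."""
    rules = []
    for f, n in sig.items():
        xs = [f"x{i}" for i in range(1, n + 1)]
        rhs = (f, tuple(("argenc", (x,)) for x in xs))
        # enc_f(x1,...,xn) -> {1 : f(argenc(x1),...,argenc(xn))}  for every f in Sigma
        rules.append((("enc_" + f, tuple(xs)), [(1, rhs)]))
        # argenc(cons_f(x1,...,xn)) -> ... for defined f, argenc(f(x1,...,xn)) -> ... for constructors f
        arg = ("cons_" + f, tuple(xs)) if f in defined else (f, tuple(xs))
        rules.append((("argenc", (arg,)), [(1, rhs)]))
    return rules

# Assuming that S_8 uses the symbols g, f, s, c with the arities below (defined symbols: g, f):
# generator_rules({"g": 0, "f": 1, "s": 1, "c": 2}, {"g", "f"})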

Example 37

For the PTRS \(\mathcal {S}_8\) from Ex. 34, we obtain the following generator rules \(\mathcal {G}(\mathcal {S}_8)\):

figure cf

As mentioned, using the symbols \(\textsf{cons}_f\) and \(\textsf{enc}_f\), every term over \(\varSigma \) can be transformed into a basic term over \(\varSigma \cup \varSigma _{\mathcal {G}(\mathcal {S})}\), as in [19].
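In the same spirit, the conversion of an arbitrary start term into its basic variant can be sketched as follows, reusing the term representation from the sketch above; the helper names to_cons and basic_variant are again our own, and the example term is \(\textsf{c}(\textsf{g},\textsf{f}(\textsf{g}))\) from above.

def to_cons(term, defined):
    """Replace every defined symbol f in the term by the constructor cons_f."""
    if isinstance(term, str):                     # a variable
        return term
    f, args = term
    g = "cons_" + f if f in defined else f
    return (g, tuple(to_cons(a, defined) for a in args))

def basic_variant(term, defined):
    """enc_f applied to the cons-encoded arguments of the start term."""
    f, args = term
    return ("enc_" + f, tuple(to_cons(a, defined) for a in args))

# basic_variant(("c", (("g", ()), ("f", (("g", ()),)))), {"f", "g"})
# yields ("enc_c", (("cons_g", ()), ("cons_f", (("cons_g", ()),))))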

However, even if \(\mathcal {S}\) is spare, the PTRS \(\mathcal {S}\cup \mathcal {G}(\mathcal {S})\) is not guaranteed to be spare, although the generator rules themselves are right-linear. The problem is that the generator rules contain rules like \(\textsf{enc}_{\textsf{f}}(x_1) \rightarrow \{1: \textsf{f}(\textsf{argenc}(x_1))\}\), where the defined symbol \(\textsf{argenc}\) occurs below the duplicating symbol \(\textsf{f}\) on the right-hand side. Indeed, while \(\mathcal {S}_8\) is spare, \(\mathcal {S}_8 \cup \mathcal {G}(\mathcal {S}_8)\) is not. For example, when starting with the basic term \(\textsf{enc}_{\textsf{f}}(\textsf{s}(\textsf{cons}_{\textsf{g}}))\), we have

$$ \begin{array}{rll} \{1:\textsf{enc}_{\textsf{f}}(\textsf{s}(\textsf{cons}_{\textsf{g}}))\} &\rightrightarrows _{\mathcal {G}(\mathcal {S}_8)}^2& \{1:\textsf{f}(\textsf{s}(\textsf{argenc}(\textsf{cons}_{\textsf{g}})))\}\\ &\rightrightarrows _{\mathcal {S}_8}& \{1:\textsf{c}(\textsf{f}(\textsf{argenc}(\textsf{cons}_{\textsf{g}})), \, \textsf{f}(\textsf{argenc}(\textsf{cons}_{\textsf{g}})))\}, \end{array} $$

where the last step is not spare. In general, \(\mathcal {S}\cup \mathcal {G}(\mathcal {S})\) is guaranteed to be spare if \(\mathcal {S}\) is right-linear. So we could modify Thm. 33 into a theorem which states that \(\mathcal {S}\) is fAST on all terms iff \(\mathcal {S}\cup \mathcal {G}(\mathcal {S})\) is iAST on basic terms (and thus, on all terms) for orthogonal and right-linear PTRSs \(\mathcal {S}\). However, this theorem would be subsumed by Thm. 17, where we already showed the equivalence of fAST and iAST if \(\mathcal {S}\) is orthogonal and right-linear. Indeed, our goal in Thm. 33 was to find a weaker requirement than right-linearity. Hence, such a transformational approach to move from fAST on all start terms to fAST on basic terms does not seem viable for Thm. 33.
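Right-linearity, the condition under which \(\mathcal {S}\cup \mathcal {G}(\mathcal {S})\) is guaranteed to be spare, is purely syntactic: no variable may occur more than once in any term on a right-hand side. A minimal check in the encoding used in the sketches above (again our own encoding, not the interface of an existing tool) could look as follows.

from collections import Counter

def var_occurrences(term, counter):
    if isinstance(term, str):                     # a variable
        counter[term] += 1
    else:
        for arg in term[1]:
            var_occurrences(arg, counter)

def is_right_linear(rules):
    """A PTRS is right-linear if no variable occurs twice in any right-hand side term."""
    for _lhs, dist in rules:
        for _prob, rhs in dist:
            counts = Counter()
            var_occurrences(rhs, counts)
            if any(c > 1 for c in counts.values()):
                return False
    return True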

Finally, we can also combine our results on simultaneous rewriting and spareness to relax both left- and right-linearity in case of basic start terms. The proof for the following theorem combines the proofs for Thm. 27 and Thm. 33.

Theorem 38

(From iAST/iPAST to fAST/fPAST (4)). If \(\mathcal {S}\) is non-overlapping and spare, then

figure cg

6 Conclusion and Evaluation

In this paper, we presented numerous new results on the relationship between full and restricted forms of AST, including several criteria for PTRSs such that innermost AST implies full AST. All of our results also hold for PAST, and all of our criteria are suitable for automation (for spareness, there exist sufficient conditions that can be checked automatically).

We implemented our new criteria in our termination prover AProVE [21]. For every PTRS, one can indicate whether its termination behavior should be analyzed for all start terms or only for basic start terms. Up to now, AProVE’s main technique for termination analysis of PTRSs was the probabilistic DP framework from [29, 32], which, however, can only prove iAST. If one wants to analyze fAST for a PTRS \(\mathcal {S}\), then AProVE now first tries to prove that the conditions of Thm. 33 are satisfied if one restricts to basic start terms, or that the conditions of Thm. 17 hold if one considers arbitrary start terms. If this succeeds, then we can use the full probabilistic DP framework to prove iAST, which then implies fAST. Otherwise, we try to prove all conditions of Thm. 38 or Thm. 27, respectively. If this succeeds, then we can use most of the processors from the probabilistic DP framework to prove iAST, which again implies fAST. If none of these theorems can be applied, then AProVE tries to prove fAST via a direct application of polynomial orderings [29]. Note that for AST w.r.t. basic start terms, Thm. 33 generalizes Thm. 17 and Thm. 38 generalizes Thm. 27, since right-linearity implies spareness.
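This strategy can be summarized as a simple cascade. The sketch below only mirrors the order of the checks described above; all predicate and prover names are hypothetical placeholders and do not correspond to AProVE’s actual interface.

# Hypothetical placeholders for the actual checks and sub-provers (not AProVE's API):
def satisfies_thm_17(ptrs): ...
def satisfies_thm_27(ptrs): ...
def satisfies_thm_33(ptrs): ...
def satisfies_thm_38(ptrs): ...
def prove_iast_full_dp(ptrs): ...
def prove_iast_most_processors(ptrs): ...
def prove_fast_polynomial_orderings(ptrs): ...

def prove_fast(ptrs, basic_start_terms):
    """Illustrative cascade for proving fAST, mirroring the order of checks described above."""
    # 1. Preprocessing via Thm. 33 (basic start terms) resp. Thm. 17 (arbitrary start terms).
    check_1 = satisfies_thm_33(ptrs) if basic_start_terms else satisfies_thm_17(ptrs)
    if check_1:
        return prove_iast_full_dp(ptrs)            # iAST then implies fAST
    # 2. Otherwise via Thm. 38 resp. Thm. 27, with most DP processors still available.
    check_2 = satisfies_thm_38(ptrs) if basic_start_terms else satisfies_thm_27(ptrs)
    if check_2:
        return prove_iast_most_processors(ptrs)    # again, iAST implies fAST
    # 3. Fall back to a direct application of polynomial orderings.
    return prove_fast_polynomial_orderings(ptrs)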

For our evaluation, we compare the old AProVE without any of the new theorems (which only uses direct applications of polynomial orderings to prove fAST), to variants of AProVE where each of the theorems is activated individually, and finally to the new AProVE strategy explained above. The following diagram shows the theoretical subsumptions among these strategies for basic start terms, where an arrow from strategy A to strategy B means that B is strictly better than A.

figure ch

We used the benchmark set of 100 PTRSs from [32], and extended it by 15 new PTRSs that contain all the examples presented in this paper and some additional examples which illustrate the power of each strategy. AProVE can prove iAST for 93 of these 115 PTRSs. The following table shows for how many of these 93 PTRSs the respective strategy allows us to conclude fAST for basic start terms from AProVE’s proof of iAST.

$$ \begin{array}{|c|c|c|c|c|c|} \hline \text{old AProVE} & \text{Thm. 17} & \text{Thm. 27} & \text{Thm. 33} & \text{Thm. 38} & \text{new AProVE}\\ \hline 36 & 48 & 44 & 58 & 56 & 61\\ \hline \end{array} $$

Of the 61 examples that “new AProVE” solves using both Thm. 33 and Thm. 38, 5 examples (all of them right-linear) can only be handled via Thm. 33, 3 examples (one right-linear, the others only spare) can only be handled via Thm. 38, and 53 examples can be handled by both. If one considers arbitrary start terms, then the new AProVE can conclude fAST (using only Thm. 17 and Thm. 27) for 49 examples.

Currently, we only use the switch from full to innermost rewriting as a preprocessing step before applying the DP framework. As future work, we want to develop a processor within the DP framework that performs this switch in a modular way. Then the criteria of our theorems would no longer have to hold for the whole PTRS, but only for specific sub-problems within the termination proof. This, however, requires developing a DP framework that analyzes fAST directly, which we also plan to investigate.

For details on our experiments, our collection of examples, and for instructions on how to run our implementation in AProVE via its web interface or locally, we refer to:

https://aprove-developers.github.io/InnermostToFullAST/

In addition, an artifact is available at [31].