1 Introduction

Recently several approaches [5, 10, 11, 18, 19, 25, 32] have been proposed to program with coinductive (coalgebraic) datatypes to support corecursion, that is, the ability of defining predicates or functions by structural recursion on non-well-founded datatypes. Such solutions are generally characterized by a strong dichotomy between inductive and coinductive definitions, the former being based on the notion of least fixed point, and the latter on that of greatest fixed point. Moreover, some proposals provide language abstractions to allow the programmer to interpret corecursive definitions not in the standard coinductive way. As a consequence, formal reasoning about programs that exploit such abstractions cannot be based on usual proof principles.

In this paper, we introduce a framework for interpreting recursive definitions as fixed points which are not necessarily the least, nor the greatest one. This allows formal reasoning in cases where the inductive and coinductive interpretation do not provide the intended meaning, or are mixed together.

To introduce the idea, let us consider the following recursive definitions of functions on lists of integers, with the meaning suggested by the name.

figure a

These definitions are written above in a widely-known programming language syntax (OCaml) for concreteness, but this is not relevant here: for such first-order functions, in most programming languages we can write analogous recursive definitions, and they are usually interpreted inductively. This means that, turning, more abstractly, such recursive definitions into meta-rules of an inference system, they are interpreted as the set of judgments which have a finite proof tree. For instance, the meta-rules for the judgment \(\textit{allPos}(l,b)\) are as follows:

\(\displaystyle \frac{}{\textit{allPos}(\varLambda ,T)}\,\,\,\displaystyle \frac{}{\textit{allPos}(x{:}l,F)}x\le 0\,\,\,\displaystyle \frac{\textit{allPos}(l,b)}{\textit{allPos}(x{:}l,b)}x >0\)

where \(\varLambda \) and  :  denote the empty list, and the list constructor, respectively, and T and F denote the boolean values. This interpretation works perfectly well on finite lists. However, with the inductive interpretation the above functions may happen to be undefined on infinite lists. For instance, the judgment \(\textit{allPos}(l,b)\) obviously has no finite proof tree if l is an infinite list of positives.

Indeed, to support structural recursion on non-well-founded structures, such as infinite lists or graphs, we typically have to use coinduction. The coinductive interpretation of an inference system is the set of judgments which have a (finite or infinite) proof tree.

In some cases, the coinductive interpretation actually yields the intended meaning. For instance, taking a slightly different version of allPos as a unary predicate \(\textit{allPos}(l)\), as it would be expressed in a logic program:

\(\displaystyle \frac{}{\textit{allPos}(\varLambda )}\,\,\,\displaystyle \frac{\textit{allPos}(l)}{\textit{allPos}(x:l)}x >0\)

it is easy to see that with the coinductive interpretation we obtain the intended meaning on infinite lists as well, since we get an infinite proof tree if and only if all the elements in the list are positive. Indeed, this interpretation has been fruitfully used in coinductive logic programming (coLP) [3, 31,32,33].

However, considering instead the previous relation \(\textit{allPos}(l,b)\), the coinductive interpretation fails to be a function, since for infinite lists of positives both the judgment \(\textit{allPos}(l,T)\) and \(\textit{allPos}(l,F)\) can be proved. Moreover, if we consider the predicate corresponding to the boolean function member:

$$\displaystyle \frac{}{\textit{member}(x,x:l)}\,\,\,\displaystyle \frac{\textit{member}(x,l)}{\textit{member}(x,y:l)}x\ne y$$

then the correct interpretation is the inductive one. Indeed, the coinductive interpretation contains all judgments \(\textit{member}(x,l)\) where l is an infinite list. Finally, for the predicates corresponding to the other example functions, which do not return a boolean, neither the inductive nor the coinductive interpretation yields the intended semantics. In particular, the coinductive interpretation contains too many elements. For instance, taking l the infinite list of 1s, by coinductively interpreting elems and maxElem we get, together with the correct judgments, also wrong ones, as will be formally shown in the following section.

All these examples suggest the idea that we should be able to “filter out” in some way the (infinite) proof trees corresponding to the coinductive interpretation, keeping only some of them. We make this possible by introducing coaxioms. A coaxiom is, intuitively, an axiom which can only be applied “at infinite depth” in a proof tree. An inference system interpreted inductively corresponds to a generalized inference system with no coaxioms, while an inference system interpreted coinductively corresponds to a generalized inference system where there is a coaxiom for each judgment.

From the model-theoretic point of view, coaxioms allow the programmer to choose the desired fixed point for a recursive definition, by selecting also fixed points which are neither the least, nor the greatest one. For instance, in the inference system for \(\textit{allPos}(l,b)\), the intended meaning is the set of judgments \(\textit{allPos}(l,b)\) where b is true if and only if the (finite or infinite) list l contains only positives. This set is a fixed point which lies between the least, which is undefined on infinite lists of positives, and the greatest, which returns both boolean values, hence is undetermined, on such lists.

Coaxioms are partly inspired by an extension of coLP and coinductive SLD resolution (coSLD) [31,32,33] with finally clauses [5], to allow more flexible interpretations of corecursive definitions of predicates, and by a related proposal in the context of object-oriented programming [10, 11]. In this paper we take a more abstract and general approach and provide a framework for interpreting corecursive definitions in a flexible way and to formally reason on their correctness.

The rest of the paper is organized as follows: in Sect. 2 we introduce the notion of generalized inference system with coaxioms, and show how to express the previous examples and others. In Sect. 3 we formally define the fixed point semantics of inference systems with coaxioms in the more general setting of complete lattices. In Sect. 4 we discuss the equivalent semantics based on the proof-theoretic approach, and in Sect. 5 we illustrate the related proof techniques on some of the examples. In Sect. 6 we show some more involved examples and discuss some subtleties, Sect. 7 surveys related work, and finally in Sect. 8 we summarize our contribution and discuss further work. A prototype meta-interpreterFootnote 1 has been developed to test the examples provided in Sects. 2 and 6.

2 Inference Systems with Coaxioms

We recall some standard notions about inference systems [1, 23].

Assume in the following a set \({\mathcal {U}}\) called the universe, whose elements are called judgments.

An inference system \(\mathcal {I}\) consists of a set of inference rules, which are pairs \({\displaystyle \frac{{\textit{Pr}}}{\textit{c}}}\), with \({\textit{Pr}}\subseteq {\mathcal {U}}\) the set of premises, \(\textit{c}\in {\mathcal {U}}\) the consequence.

The intuitive interpretation of a rule is that if the premises \({\textit{Pr}}\) hold then the consequence \(\textit{c}\) should hold as well. In particular, an axiom is (the consequence of) a rule with empty set of premises, which necessarily holds.

The (one step) inference operator \({\textit{F}_{\mathcal {I}}}:\wp ({\mathcal {U}})\rightarrow \wp ({\mathcal {U}})\) associated with an inference system \(\mathcal {I}\) is defined by:

\({\textit{F}_{\mathcal {I}}}(S)=\{\textit{c}\mid {\textit{Pr}}\subseteq S, \displaystyle \frac{{\textit{Pr}}}{\textit{c}}\in \mathcal {I}\}\)

That is, \({\textit{F}_{\mathcal {I}}}(S)\) is the set of judgments that can be inferred (in one step) from the judgments in S using the inference rules. Note that this set always includes axioms.

A set S is closed if \({\textit{F}_{\mathcal {I}}}(S)\subseteq S\), and consistent if \(S\subseteq {\textit{F}_{\mathcal {I}}}(S)\). That is, no new judgments can be inferred from a closed set, and all judgments in a consistent set can be inferred from the set itself.

The inductive interpretation of \(\mathcal {I}\), denoted \({\textit{Ind}}(\mathcal {I})\), is the smallest closed set, that is, the intersection of all closed sets, and the coinductive interpretation of \(\mathcal {I}\), denoted \({\textit{CoInd}}(\mathcal {I})\), is the largest consistent set, that is, the union of all consistent sets. Both interpretations are well-defined and can be equivalently expressed as the least (respectively, greatest) fixed point of the inference operator. Moreover, under continuity hypotheses on \({\textit{F}_{\mathcal {I}}}\), they can be computed as follows:

\({\textit{Ind}}(\mathcal {I})=\bigcup \{ \textit{F}^{n}_{\mathcal {I}}(\emptyset ) \mid n \ge 0\}\)

\({\textit{CoInd}}(\mathcal {I})=\bigcap \{ \textit{F}^{n}_{\mathcal {I}}({\mathcal {U}}) \mid n \ge 0\}\)

The inductive and coinductive interpretation can also be characterized in terms of proof trees. That is, defining a proof tree as a tree whose nodes are (labeled with) judgments in \({\mathcal {U}}\), and there is a node c with set of children \({\textit{Pr}}\) only if there exists a rule \({\displaystyle \frac{{\textit{Pr}}}{\textit{c}}}\), it can be shown [23] that \({\textit{Ind}}(\mathcal {I})\) and \({\textit{CoInd}}(\mathcal {I})\) are the sets of judgments which are the root of a finiteFootnote 2 and an arbitrary (finite or infinite) proof tree, respectively.

We introduce now our generalization.

An inference system with coaxioms is a pair \(({{\mathcal {I}},{\gamma }})\) consisting of an inference system \(\mathcal {I}\) and a set of coaxioms \(\gamma \), with \(\gamma \subseteq {\mathcal {U}}\). A coaxiom \(\textit{c}\) will be written \(\displaystyle \frac{\bullet }{\textit{c}}\), very much like an axiom, and analogously to an axiom it can be used as an initial assumption to derive other judgments. However, coaxioms will be used in a special way, explained in the following.

To illustrate the notion, we will consider an introductory example which computes the judgment \(\textit{n}{\mathop {\rightarrow }\limits ^{\star }}\mathcal{N}\) meaning that \(\mathcal{N}\) is the set of nodes reachable from a node \(\textit{n}\) of a given graph. Let us represent a graph by its set of nodes \(\textit{V}\) and a function \({\textit{adj}}\) which returns all the adjacent nodes. As usual, sets of rules can be expressed by a metarule with side conditions, and the same can be done for sets of coaxioms.

$$ \begin{array}{c} \displaystyle \frac{\textit{n}_1{\mathop {\rightarrow }\limits ^{\star }}\mathcal{N}_1\ \ldots \ \textit{n}_k{\mathop {\rightarrow }\limits ^{\star }}\mathcal{N}_k}{\textit{n}{\mathop {\rightarrow }\limits ^{\star }}\{\textit{n}\}\cup \mathcal{N}_1\cup \ldots \cup \mathcal{N}_k}{\textit{adj}}(\textit{n})=\{\textit{n}_1,\ldots ,\textit{n}_k\}{\quad }\displaystyle \frac{\bullet }{\textit{n}{\mathop {\rightarrow }\limits ^{\star }}\emptyset }\textit{n}\in \textit{V}\end{array} $$

For instance, in the case of a graph with nodes abc, with an arc from a into b and conversely, and c isolated, we would get the following metarules and coaxioms:

$$ \begin{array}{c} \displaystyle \frac{b{\mathop {\rightarrow }\limits ^{\star }}\mathcal{N}}{a{\mathop {\rightarrow }\limits ^{\star }}\{a\}\cup \mathcal{N}}{\quad }\displaystyle \frac{a{\mathop {\rightarrow }\limits ^{\star }}\mathcal{N}}{b{\mathop {\rightarrow }\limits ^{\star }}\{b\}\cup \mathcal{N}}{\quad }\displaystyle \frac{}{c{\mathop {\rightarrow }\limits ^{\star }}\{c\}} {\quad }\displaystyle \frac{\bullet }{a{\mathop {\rightarrow }\limits ^{\star }}\emptyset }{\quad }\displaystyle \frac{\bullet }{b{\mathop {\rightarrow }\limits ^{\star }}\emptyset }{\quad }\displaystyle \frac{\bullet }{c{\mathop {\rightarrow }\limits ^{\star }}\emptyset } \end{array} $$

If we interpret the metarules inductively (excluding the coaxioms), then we get only the judgment \(c{\mathop {\rightarrow }\limits ^{\star }}\{c\}\). In other words, a visit computing \(\textit{n}{\mathop {\rightarrow }\limits ^{\star }}\mathcal{N}\), like other judgments on graphs, should mark already encountered nodes to avoid non termination, since the graph structure is not well-founded. On the other hand, if we interpret the metarules coinductively (excluding again the coaxioms), then we get the correct judgments \(a{\mathop {\rightarrow }\limits ^{\star }}\{a,b\}\) and \(b{\mathop {\rightarrow }\limits ^{\star }}\{a,b\}\), but we also get the wrong judgments \(a{\mathop {\rightarrow }\limits ^{\star }}\{a,b,c\}\) and \(b{\mathop {\rightarrow }\limits ^{\star }}\{a,b,c\}\).

We define a different interpretation, called interpretation generated by the coaxioms and denoted \({\textit{Gen}}(\mathcal {I},\gamma )\), which takes into account the coaxioms in the following way.

  1. 1.

    First, we take the smallest closed superset of the set of coaxioms. In other words, we consider the inference system \({{\mathcal {I}_{{\sqcup }\gamma }}}\) obtained enriching \(\mathcal {I}\) by judgments in \(\gamma \) considered as axioms, and we take its inductive interpretation \({\textit{Ind}}({{\mathcal {I}_{{\sqcup }\gamma }}})\).

  2. 2.

    Then, we take the largest consistent subset of \({\textit{Ind}}({{\mathcal {I}_{{\sqcup }\gamma }}})\). In other words, we take the coinductive interpretation of the inference system obtained from \(\mathcal {I}\) by keeping only rules with consequence in \({\textit{Ind}}({{\mathcal {I}_{{\sqcup }\gamma }}})\), that is, we define

    \({\textit{Gen}}(\mathcal {I},\gamma )={\textit{CoInd}}({\mathcal {I}_{{\sqcap }{\textit{Ind}}({{\mathcal {I}_{{\sqcup }\gamma }}})}})\)

where \({\mathcal {I}_{{\sqcap }S}}\), with \(\mathcal {I}\) inference system and \(S\subseteq {\mathcal {U}}\), denotes the inference system obtained from \(\mathcal {I}\) by keeping only rules with consequence in S.

In the example, in the first phase we obtain the following judgments (each line corresponds to an iteration of the inference operator):

\(a{\mathop {\rightarrow }\limits ^{\star }}\emptyset \), \(b{\mathop {\rightarrow }\limits ^{\star }}\emptyset \), \(c{\mathop {\rightarrow }\limits ^{\star }}\emptyset \), \(c{\mathop {\rightarrow }\limits ^{\star }}\{c\}\)

\(a{\mathop {\rightarrow }\limits ^{\star }}\emptyset \), \(b{\mathop {\rightarrow }\limits ^{\star }}\emptyset \), \(c{\mathop {\rightarrow }\limits ^{\star }}\emptyset \), \(c{\mathop {\rightarrow }\limits ^{\star }}\{c\}\), \(a{\mathop {\rightarrow }\limits ^{\star }}\{a\}\), \(b{\mathop {\rightarrow }\limits ^{\star }}\{b\}\)

\(a{\mathop {\rightarrow }\limits ^{\star }}\emptyset \), \(b{\mathop {\rightarrow }\limits ^{\star }}\emptyset \), \(c{\mathop {\rightarrow }\limits ^{\star }}\emptyset \), \(c{\mathop {\rightarrow }\limits ^{\star }}\{c\}\), \(a{\mathop {\rightarrow }\limits ^{\star }}\{a\}\), \(b{\mathop {\rightarrow }\limits ^{\star }}\{b\}\), \(a{\mathop {\rightarrow }\limits ^{\star }}\{a,b\},b{\mathop {\rightarrow }\limits ^{\star }}\{a,b\}\)

The last set is closed, hence it is \({\textit{Ind}}({{\mathcal {I}_{{\sqcup }\gamma }}})\).

In the second phase, each iteration of the inference operator removes judgments which cannot be inferred from the previous step, that is, we get:

\(c{\mathop {\rightarrow }\limits ^{\star }}\{c\}\), \(a{\mathop {\rightarrow }\limits ^{\star }}\{a\}\), \(b{\mathop {\rightarrow }\limits ^{\star }}\{b\}\), \(a{\mathop {\rightarrow }\limits ^{\star }}\{a,b\},b{\mathop {\rightarrow }\limits ^{\star }}\{a,b\}\)

\(c{\mathop {\rightarrow }\limits ^{\star }}\{c\}\), \(a{\mathop {\rightarrow }\limits ^{\star }}\{a,b\}\), \(b{\mathop {\rightarrow }\limits ^{\star }}\{a,b\}\)

This last set is consistent, hence it is \({\textit{Gen}}(\mathcal {I},\gamma )\), and it is indeed the expected result.

Note that the inductive and coinductive interpretation can be obtained as special cases of the interpretation generated by coaxioms of an inference system, notably:

  • the inductive interpretation when the set of coaxioms is empty

  • the coinductive interpretation when the set of coaxioms is the universe.

In terms of proof trees, judgments in \({\textit{Gen}}(\mathcal {I},\gamma )\) are those which have an arbitrary (finite or infinite) proof tree t in the inference system \(\mathcal {I}\), whose nodes all have a finite proof tree in \({{\mathcal {I}_{{\sqcup }\gamma }}}\). Note that for nodes in t which are roots of a finite subtree this always holds (a finite proof tree in \(\mathcal {I}\) is a finite proof tree in \({{\mathcal {I}_{{\sqcup }\gamma }}}\) as well), hence the condition is only significant for nodes which are roots of an infinite path in the proof tree.

For instance, in the example, the judgment \(a{\mathop {\rightarrow }\limits ^{\star }}\{a,b\}\) has an infinite proof tree in \(\mathcal {I}\) where each node has a finite proof tree in \({{\mathcal {I}_{{\sqcup }\gamma }}}\), as shown below.

\(\displaystyle \frac{ \displaystyle \frac{\displaystyle \frac{\ldots }{a{\mathop {\rightarrow }\limits ^{\star }}\{a,b\}}}{b{\mathop {\rightarrow }\limits ^{\star }}\{a,b\}} }{a{\mathop {\rightarrow }\limits ^{\star }}\{a,b\}}\)    \(\displaystyle \frac{ \displaystyle \frac{\displaystyle \frac{}{a{\mathop {\rightarrow }\limits ^{\star }}\emptyset }}{b{\mathop {\rightarrow }\limits ^{\star }}\{b\}} }{a{\mathop {\rightarrow }\limits ^{\star }}\{a,b\}}\)    \(\displaystyle \frac{ \displaystyle \frac{\displaystyle \frac{}{b{\mathop {\rightarrow }\limits ^{\star }}\emptyset }}{a{\mathop {\rightarrow }\limits ^{\star }}\{a\}} }{b{\mathop {\rightarrow }\limits ^{\star }}\{a,b\}}\)

Moreover, there is another important property which will be proved in Sect. 4: if a judgment belongs to \({\textit{Gen}}(\mathcal {I},\gamma )\), then, for all \(n\ge 0\), it has a proof tree in the inference system \({{\mathcal {I}_{{\sqcup }\gamma }}}\) where coaxioms can only be used at depth greater than n.

For instance, in the example, it is easy to see that, for any n, we can obtain a finite proof tree for the judgment \(a{\mathop {\rightarrow }\limits ^{\star }}\{a,b\}\) in \({{\mathcal {I}_{{\sqcup }\gamma }}}\) where coaxioms are used at depth greater than n, as shown below.

\(\displaystyle \frac{ \displaystyle \frac{\displaystyle \frac{}{a{\mathop {\rightarrow }\limits ^{\star }}\emptyset }}{b{\mathop {\rightarrow }\limits ^{\star }}\{b\}} }{a{\mathop {\rightarrow }\limits ^{\star }}\{a,b\}}\)    \(\displaystyle \frac{ \displaystyle \frac{\displaystyle \frac{ \displaystyle \frac{}{b{\mathop {\rightarrow }\limits ^{\star }}\emptyset } }{a{\mathop {\rightarrow }\limits ^{\star }}\{a\}}}{b{\mathop {\rightarrow }\limits ^{\star }}\{a,b\}} }{a{\mathop {\rightarrow }\limits ^{\star }}\{a,b\}}\)    \(\displaystyle \frac{ \displaystyle \frac{\displaystyle \frac{ \displaystyle \frac{\displaystyle \frac{}{a{\mathop {\rightarrow }\limits ^{\star }}\emptyset }}{b{\mathop {\rightarrow }\limits ^{\star }}\{b\}} }{a{\mathop {\rightarrow }\limits ^{\star }}\{a,b\}}}{b{\mathop {\rightarrow }\limits ^{\star }}\{a,b\}} }{a{\mathop {\rightarrow }\limits ^{\star }}\{a,b\}}\)   ...

This last property motivates the name “coaxioms”. Indeed, dually to axioms, which can be used in the proof tree at every depth, including 0, coaxioms can only be used “at an infinite depth” in the proof tree. Therefore, coaxioms filter out undesired infinite proof trees; in other words, they bound from above the greatest fixed point corresponding to the semantics of the generalized inference system.

As a second example, we consider the definition of the first sets in a grammar. Let us represent a context-free grammar by its set of terminals T, its set of non-terminals N, and all the productions \(\textit{A}::=\beta _1\mid \ldots \mid \beta _n\) with left-hand side \(\textit{A}\), for each non-terminal \(\textit{A}\). Recall that, for each \(\alpha \in (T\cup N)^{+}\), we can define the set \({\textit{first}}(\alpha ) = \{\sigma \mid \sigma \in T, \alpha {\rightarrow ^\star }\sigma \beta \}\). Informally, \({\textit{first}}(\alpha )\) is the set of the initial terminal symbols of the strings which can be derived from a string \(\alpha \) in 0 or more steps.

The following inference system with coaxioms defines the judgment \({\textit{first}}(\alpha ,\mathcal{F})\), with \(\mathcal{F}\subseteq T\).

$$ \begin{array}{c} \displaystyle \frac{}{{\textit{first}}(\sigma \alpha ,\{\sigma \})}\sigma \in T{\quad }\displaystyle \frac{{\textit{first}}(\textit{A},\mathcal{F})}{{\textit{first}}(\textit{A}\alpha ,\mathcal{F})} \begin{array}{l} \textit{A}\in N\\ \textit{A}{\not \rightarrow ^\star }\epsilon \end{array} {\quad }\displaystyle \frac{{\textit{first}}(\textit{A},\mathcal{F})\,\,\,{\textit{first}}(\alpha ,\mathcal{F}')}{{\textit{first}}(\textit{A}\alpha ,\mathcal{F}\cup \mathcal{F}')} \begin{array}{l} \textit{A}\in N\\ \textit{A}{\rightarrow ^\star }\epsilon \end{array} \\[4ex] \displaystyle \frac{}{{\textit{first}}(\epsilon ,\emptyset )} {\quad }\displaystyle \frac{{\textit{first}}(\beta _1,\mathcal{F}_1)\ \ldots \ {\textit{first}}(\beta _n,\mathcal{F}_n)}{{\textit{first}}(\textit{A},\mathcal{F}_1\cup \ldots \cup \mathcal{F}_n)}\textit{A}::=\beta _1\mid \ldots \mid \beta _n {\quad }\displaystyle \frac{\bullet }{{\textit{first}}(\textit{A},\emptyset )}\textit{A}\in N \end{array} $$

The rules of the inference system correspond to the natural recursive definition of first. Note, in particular, that in a string of shape \(\textit{A}\alpha \), if the non-terminal \(\textit{A}\) is nullable, that is, we can derive from it the empty string, then the first set for \(\textit{A}\alpha \) should also include the first set for \(\alpha \).

As in the previous example on graphs, the problem with this recursive definition is that, since the non-terminals in a grammar can mutually refer to each other, the function defined by the inductive interpretation can be undefined. That is, a naive top-down implementation might not terminate. For this reason, first sets are typically computed by an imperative bottom-up algorithm, or the top-down implementation is corrected by marking already encountered non-terminals, analogously to what is done for visiting graphs. Again as in the previous example, the coinductive interpretation may fail to be a function, whereas, with the coaxioms, we get the expected result.

We express now as inference systems with coaxioms the recursive definitions of functions shown at the beginning of Sect. 1. Let \({\mathbb {Z}}\) denote the set of integers, and \({\mathbb {L}}\) the set of (finite and infinite) lists of integers.

The first example is the function which checks whether all the elements of a list are positive, expressed by judgments of shape \(\textit{allPos}(l,b)\) with \(l\in {\mathbb {L}}\) and \(b\in \{T,F\}\).

$$ \begin{array}{c} \displaystyle \frac{}{\textit{allPos}(\varLambda ,T)}{\quad }\displaystyle \frac{}{\textit{allPos}(x{:}l,F)}x\le 0{\quad }\displaystyle \frac{\textit{allPos}(l,b)}{\textit{allPos}(x{:}l,b)}x >0{\quad }\displaystyle \frac{\bullet }{\textit{allPos}(l,T)} \end{array} $$

With the coaxioms, we obtain the expected function also on infinite lists of positives: indeed, we only consider the infinite trees where the nodes have a finite proof tree in the inference system enriched by the coaxioms. In this way, the infinite tree where \(b=F\) is filtered out.

The function which checks whether an element belongs to a list, expressed by judgments of shape \({\textit{member}}(x,l,b)\) with \(x\in {\mathbb {Z}}\), \(l\in {\mathbb {L}}\) and \(b\in \{T,F\}\), is a very similar example, with the difference that the coaxioms map every list into false rather than true.

$$ \begin{array}{c} \displaystyle \frac{}{{\textit{member}}(x,\varLambda ,F)}{\quad }\displaystyle \frac{}{{\textit{member}}(x,x{:}l,T)}{\quad }\displaystyle \frac{{\textit{member}}(x,l,b)}{{\textit{member}}(x,y:l,b)}x\ne y{\quad }\displaystyle \frac{\bullet }{{\textit{member}}(x,l,F)} \end{array} $$

Analogously to the previous example, with the coaxioms we obtain the expected result also on infinite lists which do not contain the element.

The function which returns the set of the elements contained in a list is expressed by judgments of shape \({\textit{elems}}(l,{ xs })\), with \(l\in {\mathbb {L}}\) and \({ xs }\in \wp ({\mathbb {Z}})\).

$$ \begin{array}{c} \displaystyle \frac{}{{\textit{elems}}(\varLambda ,\emptyset )}{\quad }\displaystyle \frac{{\textit{elems}}(l,{ xs })}{{\textit{elems}}(x{:}l,\{x\}\cup xs)}{\quad }\displaystyle \frac{\bullet }{{\textit{elems}}(l,\emptyset )} \end{array} $$

In this case, the inductive interpretation gives the expected result only on finite lists, and the coinductive interpretation fails to be a function on infinite lists. For instance, for l the infinite list of 1s, any judgment \({\textit{elems}}(l,{ xs })\) with \(1 \in { xs }\) can be derived. Indeed, for any such judgment we can construct an infinite proof tree which is a chain of applications of the last metarule. With the coaxioms, we only consider the infinite trees where the node \({\textit{elems}}(l,{ xs })\) has a finite proof tree in the inference system enriched by the coaxioms. This is only true for \({ xs }=\{1\}\).

Note that coaxioms are needed to get the expected result not only on regular lists. Considering for example the infinite list 1 : 2 : 1 : 1 : 2 : 1 : 1 : 1 : 2 : ..., it is easy to see that the same reasoning holds.

Finally, the function which returns the greatest element contained in a (non-empty) list is expressed by judgments of shape \({\textit{max}}(l,x)\), with \(l\in {\mathbb {L}}\) and \(x\in {\mathbb {Z}}\).

$$ \begin{array}{c} \displaystyle \frac{}{{\textit{max}}(x{:}\varLambda ,x)}{\quad }\displaystyle \frac{{\textit{max}}(l,y)}{{\textit{max}}(x{:}l,z)}z=\max (x,y) {\quad }\displaystyle \frac{\bullet }{{\textit{max}}(x{:}l,x)} \end{array} $$

Analogously to the previous example, the coinductive interpretation fails to be a function (for instance, for l the infinite list of 1s, any judgment \({\textit{max}}(l,x)\) with \(x\ge 1\) can be derived), and the coaxioms “filter out” the wrong results.

3 Bounded Fixed Points

In this section, after recalling basic definitions, we define the bounded fixed point generated by an element, justifying its existence by the Knaster-Tarski theorem [34]. Then, we show that the interpretation generated by coaxioms of an inference system corresponds to a bounded fixed point in the powerset lattice. Finally, we provide a constructive characterization of bounded fixed points, again justified by a classical result (Kleene theorem). We refer to [22] for an history of these theorems with a number of good references.

In the following we assume a complete lattice \(({{L},{\le }})\) with top and bottom elements \(\top \) and \(\bot \), and meet and join operations \(\sqcap \) and \(\sqcup \). Moreover, we use and \(\bigsqcup \) to denote meet (greatest lower bound) and join (least upper bound) of a set, respectively.

Basic Definitions. Let \(\textit{F}:L\rightarrow L\), and \(x\in L\). Then, x is a pre-fixed point of \(\textit{F}\) iff \(\textit{F}(x)\le x\); x is a post-fixed point of \(\textit{F}\) iff \(x\le \textit{F}(x)\); and x is a fixed point of \(\textit{F}\) iff \(x=\textit{F}(x)\). Pre-fixed points will be also called closed, and post-fixed points will be also called consistent points. A function \(\textit{F}:L\rightarrow L\) is monotone if, for all \(x,y \in L\), \(x\le y \Rightarrow \textit{F}(x) \le \textit{F}(y)\).

In this general setting, the role of the universe is played by the top \(\top \) of L, that of the inference system by a monotone function \(\textit{F}\), and that of the co-axioms by a distinguished element \(\gamma \in L\), called generator.

Definition of Bounded Fixed-Point. In the following we assume a monotone function \(\textit{F}:L\rightarrow L\). The bounded fixed point generated by an element \(\gamma \) is the greatest fixed point of the monotone function obtained by restricting \(\textit{F}\) to the down-set of the least pre-fixed point above \(\gamma \). The construction is detailed and justified below. First of all we introduce two notations.

Definition 1

Let \(x \in L\). Then:

  • The closure of x w.r.t. \(\textit{F}\) is the element \(\nabla _\textit{F}(x)\) of L defined by

    .

  • The kernel of x w.r.t. \(\textit{F}\) is the element \(\varDelta _\textit{F}(x)\) of L defined by

    \(\varDelta _\textit{F}(x) = \bigsqcup \{ y \in L \mid y \le x,\, y \le \textit{F}(y)\}\).

We can also see \(\varDelta _\textit{F}\) and \(\nabla _\textit{F}\) as endofunctions on L, which are instances of well-known notions in lattice theory: closure and kernel operators.

From this definition immediately follows the bounded coinduction principle. Indeed, given \(\beta \in L\), we have:

  • (CoInd) If \(x \le \textit{F}(x)\) (x post-fixed), and \(x \le \beta \), then \(x \le \varDelta _\textit{F}(\beta )\).

The standard coinduction principle can be obtained as a specific instance of the more general principle above, by taking \(\beta =\top \); for this particular case the hypothesis \(x \le \beta \) can be omitted, since it trivially holds. We will show in detail how to use this proof principle in Sect. 5.

The closure of an arbitrary element \(\gamma \) turns out to be the best closed approximation of \(\gamma \), that is, the least pre-fixed point of \(\textit{F}\) above \(\gamma \), as shown below.

Proposition 1

Let \(\gamma \in L\). Then, \(z=\nabla _\textit{F}(\gamma )\) is the least pre-fixed point of \(\textit{F}\) above \(\gamma \).

Proof

Set \(S = \{x \in L \mid \gamma \le x,\, \textit{F}(x) \le x\}\). We have to prove that \(z\in S\), which then implies, by definition, that it is its least element. Since \(\gamma \) is a lower bound for all \(x\in S\), by definition of meet we get \(\gamma \le z\). We can show that z is a pre-fixed point of \(\textit{F}\) by the following steps:

  • for all \(x\in S\), \(\textit{F}(x) \le x\) (def. of S) and \(z \le x\) (def. of );

  • for all \(x\in S\), \(\textit{F}(x) \le x\) (def. of S) and \(\textit{F}(z)\le \textit{F}(x)\) (\(\textit{F}\) is monotone);

  • for all \(x\in S\), \(\textit{F}(z) \le x\) (transitivity);

  • \(\textit{F}(z) \le z\) (def. of ).

   \(\square \)

Note that if \(\gamma =\bot \) we have that \(\nabla _\textit{F}(\bot )\) is the least pre-fixed point of \(\textit{F}\), that, thanks to the Knaster-Tarski theorem, is the least fixed point of \(\textit{F}\).

The kernel of a pre-fixed point \(\beta \) turns out to be the greatest (post-)fixed-point of \(\textit{F}\) below \(\beta \), as shown below.

Proposition 2

Let \(\beta \in L\). If \(\beta \) is a pre-fixed point of \(\textit{F}\) and \(z = \varDelta _\textit{F}(\beta )\), then \(\textit{F}(z)=z\).

Proof

If \(\beta \) is an element of a complete lattice, then \(L_\beta = \{ x \in L \mid x \le \beta \}\) is also a complete lattice, with top element \(\beta \). If \(\beta \) is a pre-fixed point of \(\textit{F}\), then \(\textit{F}\) is a monotone endofunction on \(L_\beta \). Therefore, by the Knaster-Tarski theorem \(\textit{F}(z)=z\).

We can now define bounded fixed points generated by an element.

Definition 2

(Bounded fixed point). Let \(\gamma \in L\). The bounded fixed point of \(\textit{F}\) generated by \(\gamma \), denoted \({\textit{Gen}}(\textit{F},\gamma )\), is the greatest fixed point of \(\textit{F}\) below the closure of \(\gamma \), that is, \({\textit{Gen}}(\textit{F},\gamma ) = \varDelta _\textit{F}(\nabla _\textit{F}(\gamma ))\).

The bounded fixed point is well-defined since, thanks to Proposition 2, there exists the greatest fixed point below \(\beta \), provided that the bound is a pre-fixed point. Since in general \(\gamma \) might not be pre-fixed, we need to construct a pre-fixed point from \(\gamma \). Note that the first step of this construction cannot be expressed as the least fixed point of \(\textit{F}\) on the complete lattice \(\{ x \in L \mid x \ge \gamma \}\), since in general \(\textit{F}\) may fail to be an endofunction (e.g., if \(\textit{F}\) is the function which maps any element to \(\bot < \gamma \)). Indeed, \(\nabla _\textit{F}(\gamma )\) is not a fixed point in general, but only a pre-fixed point: we need the two steps to obtain a fixed point.

Note also that the definition of bounded fixed point is asymmetric, that is, we take the greatest fixed point bounded from above by a least (pre-)fixed point, rather than the other way round. This is motivated by the intuition, explained in the previous section, that we essentially need a greatest fixed point, since we want to deal with non-well-founded structures, but we want to “constrain” in some way such greatest fixed point. Investigating the symmetric construction is a matter of further work (see the Conclusion).

An important fact is that bounded fixed points are a generalization of both least and greatest fixed points, since they can be obtained by taking particular generators, as stated in the following proposition.

Proposition 3

  1. 1.

    \({\textit{Gen}}(\textit{F},\top )\) is the greatest fixed point of \(\textit{F}\)

  2. 2.

    \({\textit{Gen}}(\textit{F},\bot )\) is the least fixed point of \(\textit{F}\).

Proof

  1. 1.

    Note that \(\nabla _\textit{F}(\top )=\top \), since the only pre-fixed point above \(\top \) is \(\top \) itself, therefore we get \({\textit{Gen}}(\textit{F},\top ) = \varDelta _\textit{F}(\top )\), that is, the greatest fixed point of \(\textit{F}\), by Proposition 2.

  2. 2.

    As already noted \(\nabla _\textit{F}(\bot )\) is the least fixed point of \(\textit{F}\), in particular \(\nabla _\textit{F}(\bot )\) is post-fixed, therefore we get \({\textit{Gen}}(\textit{F},\bot ) = \varDelta _\textit{F}(\nabla _\textit{F}(\bot )) = \nabla _\textit{F}(\bot )\), namely it is the least fixed point of \(\textit{F}\).

   \(\square \)

Coaxioms as Generators. In Sect. 2 we have described two steps to construct \({\textit{Gen}}(\mathcal {I},\gamma )\), the interpretation generated by coaxioms \(\gamma \) of an inference system \(\mathcal {I}\).

  1. 1.

    First, we consider the inference system \({{\mathcal {I}_{{\sqcup }\gamma }}}\) obtained enriching \(\mathcal {I}\) by judgments in \(\gamma \) considered as axioms, and we take its inductive interpretation \({\textit{Ind}}({{\mathcal {I}_{{\sqcup }\gamma }}})\).

  2. 2.

    Then, we take the coinductive interpretation of the inference system obtained from \(\mathcal {I}\) by keeping only rules with consequence in \({\textit{Ind}}({{\mathcal {I}_{{\sqcup }\gamma }}})\), that is, we define

    \({\textit{Gen}}(\mathcal {I},\gamma )={\textit{CoInd}}({\mathcal {I}_{{\sqcap }{\textit{Ind}}({{\mathcal {I}_{{\sqcup }\gamma }}})}})\)

The definition of bounded fixed point is the formulation of these two steps in the general setting of complete lattices. Indeed, the inference operator \({\textit{F}_{\mathcal {I}}}\) is a monotone function on the complete lattice \(\wp ({\mathcal {U}})\) obtained by taking set inclusion as order, and specifying the coaxioms \(\gamma \) corresponds to fixing an arbitrary element of L as generator. To show the correspondence in a precise way, we give an alternative and equivalent characterization of closure.

Proposition 4

Let \(\gamma \in L\) and consider the function \({\textit{F}_{{\sqcup }\gamma }}:L\rightarrow L\) defined by \({\textit{F}_{{\sqcup }\gamma }} (x) = \textit{F}(x) \sqcup \gamma \), that is clearly monotone. Then, \(\nabla _{\textit{F}_{{\sqcup }\gamma }}(\bot ) = \nabla _\textit{F}(\gamma )\).

Proof

To prove the statement it is enough to show that \(y\in L\) is a pre-fixed point of \({\textit{F}_{{\sqcup }\gamma }}\) iff y is a pre-fixed point of \(\textit{F}\) and \(y\ge \gamma \). This trivially follows from the definition of \({\textit{F}_{{\sqcup }\gamma }}\) and \(\sqcup \), indeed \(\textit{F}(y) \sqcup \gamma = {\textit{F}_{{\sqcup }\gamma }}(y) \le y\) is equivalent to \(\textit{F}(y) \le y \) and \(\gamma \le y\).    \(\square \)

By this alternative characterization we can formally state the correspondence with the two steps for defining \({\textit{Gen}}(\mathcal {I},\gamma )\).

Theorem 1

Let \(\mathcal {I}\) be an inference system and \(\gamma , \beta \in \wp ({\mathcal {U}})\), with \(\beta \) closed w.r.t. \({\textit{F}_{\mathcal {I}}}\), then the following facts hold:

  1. 1.

    \({({\textit{F}_{\mathcal {I}}})_{{\sqcup }\gamma }} = {\textit{F}_{({\mathcal {I}_{{\sqcup }\gamma }})}}\) (so we can safely omit brackets)

  2. 2.

    \(\nabla _{\textit{F}_{\mathcal {I}}}(\gamma ) = {\textit{Ind}}({\mathcal {I}_{{\sqcup }\gamma }})\)

  3. 3.

    \(\varDelta _{\textit{F}_{\mathcal {I}}}(\beta ) = {\textit{CoInd}}({\mathcal {I}_{{\sqcap }\beta }})\).

Proof

  1. 1.

    We have to show that, for \(S \subseteq {\mathcal {U}}\), \({({\textit{F}_{\mathcal {I}}})_{{\sqcup }\gamma }}(S) = {\textit{F}_{({\mathcal {I}_{{\sqcup }\gamma }})}}(S)\). If \(\textit{c}\in {({\textit{F}_{\mathcal {I}}})_{{\sqcup }\gamma }}(S)\), then either \(\textit{c}\in \gamma \) or \(\textit{c}\in {\textit{F}_{\mathcal {I}}}(S)\); in the former case there exists \(\displaystyle \frac{}{\textit{c}} \in {\mathcal {I}_{{\sqcup }\gamma }}\) by definition, in the latter there exists \(\displaystyle \frac{{\textit{Pr}}}{\textit{c}} \in \mathcal {I}\) such that \({\textit{Pr}}\subseteq S\), and this implies \(\displaystyle \frac{{\textit{Pr}}}{\textit{c}} \in {\mathcal {I}_{{\sqcup }\gamma }}\). Therefore in both cases \(\textit{c}\in {\textit{F}_{({\mathcal {I}_{{\sqcup }\gamma }})}}(S)\).

    Conversely, if \(\textit{c}\in {\textit{F}_{({\mathcal {I}_{{\sqcup }\gamma }})}}(S)\), then there exists \(\displaystyle \frac{{\textit{Pr}}}{\textit{c}} \in {\mathcal {I}_{{\sqcup }\gamma }}\) such that \({\textit{Pr}}\subseteq S\). By definition of \({\mathcal {I}_{{\sqcup }\gamma }}\), either \(\displaystyle \frac{{\textit{Pr}}}{\textit{c}} \in \mathcal {I}\) or \(\textit{c}\in \gamma \) and \({\textit{Pr}}= \emptyset \), therefore in the former case \(\textit{c}\in {\textit{F}_{\mathcal {I}}}(S)\) and in the latter \(\textit{c}\in \gamma \), thus in both cases \(\textit{c}\in {({\textit{F}_{\mathcal {I}}})_{{\sqcup }\gamma }}(S)\).

  2. 2.

    By Proposition 4 we get that \(\nabla _{\textit{F}_{\mathcal {I}}}(\gamma ) = \nabla _{\textit{F}_{{\mathcal {I}_{{\sqcup }\gamma }}}}(\emptyset )\), that is, the least fixed point of \({\textit{F}_{{\mathcal {I}_{{\sqcup }\gamma }}}}\), thanks to statement (1) of this proposition and Proposition 2. Therefore, it corresponds to the inductive interpretation of the inference system \({\mathcal {I}_{{\sqcup }\gamma }}\), \({\textit{Ind}}({\mathcal {I}_{{\sqcup }\gamma }})\).

  3. 3.

    Let \(X = {\textit{CoInd}}({\mathcal {I}_{{\sqcap }\beta }})\), we have to show the two inclusions. First note that X is a post-fixed point w.r.t. \({\textit{F}_{\mathcal {I}}}\), indeed \(X \subseteq {\textit{F}_{{\mathcal {I}_{{\sqcap }\beta }}}}(X)\), by definition of the coinductive interpretation, and \({\textit{F}_{{\mathcal {I}_{{\sqcap }\beta }}}}(X) \subseteq {\textit{F}_{\mathcal {I}}}(X)\), since each \(\textit{c}\in {\textit{F}_{{\mathcal {I}_{{\sqcap }\beta }}}}(X) \) is the consequence of a rule \(\displaystyle \frac{{\textit{Pr}}}{\textit{c}} \in {\mathcal {I}_{{\sqcap }\beta }}\) and by construction of \({\mathcal {I}_{{\sqcap }\beta }}\), this rule is also a rule of \(\mathcal {I}\), therefore \(\textit{c}\in {\textit{F}_{\mathcal {I}}}(X)\). In addition \(\textit{c}\in \beta \) again by definition of \({\mathcal {I}_{{\sqcap }\beta }}\), thus \(X \subseteq \beta \), therefore by (CoInd) we get \(X\subseteq \varDelta _{\textit{F}_{\mathcal {I}}}(\beta )\).

    On the other hand \(\varDelta _{\textit{F}_{\mathcal {I}}}(\beta )\) is a post-fixed point of \({\textit{F}_{{\mathcal {I}_{{\sqcap }\beta }}}}\). To show this fact first we note that for each \(S \subseteq \beta \) we have \({\textit{F}_{\mathcal {I}}}(S) \subseteq {\textit{F}_{{\mathcal {I}_{{\sqcap }\beta }}}}(S)\), indeed if \(\textit{c}\in {\textit{F}_{\mathcal {I}}}(S)\) then there exists a rule \(\displaystyle \frac{{\textit{Pr}}}{\textit{c}} \in \mathcal {I}\) such that \({\textit{Pr}}\subseteq S\), moreover we have that \({{\textit{F}_{\mathcal {I}}}(S)} \subseteq {\textit{F}_{\mathcal {I}}}(\beta ) \subseteq \beta \) since \(\beta \) is closed, so \(\displaystyle \frac{{\textit{Pr}}}{\textit{c}} \in {\mathcal {I}_{{\sqcap }\beta }}\) that implies that \(\textit{c}\in {\textit{F}_{{\mathcal {I}_{{\sqcap }\beta }}}}\). Then, since \(\varDelta _{\textit{F}_{\mathcal {I}}}(\beta )\) is a post-fixed point of \({\textit{F}_{\mathcal {I}}}\) below \(\beta \), we get that \(\varDelta _{\textit{F}_{\mathcal {I}}}(\beta ) \subseteq {\textit{F}_{\mathcal {I}}}(\varDelta _{\textit{F}_{\mathcal {I}}}(\beta )) \subseteq {\textit{F}_{{\mathcal {I}_{{\sqcap }\beta }}}}(\varDelta _{\textit{F}_{\mathcal {I}}}(\beta ))\), so it is a post-fixed point. Therefore by the coinduction principle we get the other inclusion.

Thanks to Theorem 1, we can conclude that, given an inference system with coaxioms \(({{\mathcal {I}},{\gamma }})\):

$${\textit{Gen}}(\mathcal {I},\gamma )= {\textit{CoInd}}({\mathcal {I}_{{\sqcap }{\textit{Ind}}({\mathcal {I}_{{\sqcup }\gamma }})}}) = \varDelta _{\textit{F}_{\mathcal {I}}} (\nabla _{\textit{F}_{\mathcal {I}}}(\gamma )) = {\textit{Gen}}({\textit{F}_{\mathcal {I}}},\gamma )$$

That is, the interpretation generated by coaxioms \(\gamma \) of the inference system \(\mathcal {I}\) is exactly the bounded fixed point of \({\textit{F}_{\mathcal {I}}}\) generated by \(\gamma \).

Constructive Characterization of Bounded Fixed Point. The Kleene’s theorem states that, under continuity hypotheses on \(\textit{F}\), we can characterize its greatest fixed point as the greatest lower bound of the descending chain obtained by repeatedly applying \(\textit{F}\) to \(\top \). By considering this theorem for the sublattice obtained as down-set of the bound, we can obtain a constructive characterization of the bounded fixed point generated by an element.

We recall some basic definitions. A descending chain in L is a set \(C= \{ x_i \mid i \in \mathbb {N}\} \subseteq L\) such that, for each \(i \in \mathbb {N}\), \(x_i \ge x_{i+1}\). A function \(\textit{F}:L\rightarrow L\) preserves meet of descending chains if and only if, for all descending chains C in L, we have where \(\textit{F}(C) = \{ \textit{F}(x_i) \mid x_i \in C\}\).

Given a function \(\textit{F}:L\rightarrow L\) and an element \(\beta \in L\), set \( C _{{\textit{F}},{\beta }}=\{ \textit{F}^n(\beta ) \mid n \in \mathbb {N}\}\).

Proposition 5

Let \(\textit{F}:L\rightarrow L\) be a function that preserves meet of descending chains, and \(\beta \in L\) a pre-fixed point of \(\textit{F}\). Then:

  1. 1.

    \( C _{{\textit{F}},{\beta }}\) is a descending chain in L

  2. 2.

    , that is, is the greatest fixed point of \(\textit{F}\) below \(\beta \).

Proof

  1. 1.

    Since \(\textit{F}\) preserves meet of descending chains, it is monotone, therefore, since \(\beta \) is pre-fixed, we get the thesis.

  2. 2.

    If \(\beta \) is an element of a complete lattice, then \(L_\beta = \{ x \in L \mid x \le \beta \}\) is also a complete lattice, with top element \(\beta \). If \(\beta \) is a closed point of \(\textit{F}\), then \(\textit{F}\) is a monotone endofunction on \(L_\beta \) and it still preserves meet of descending chains. Therefore applying Kleene’s theorem to \(L_\beta \) we get the thesis.

   \(\square \)

Note that for this constructive characterization we need an additional hypothesis on \(\textit{F}\). Under this assumption, the result of Proposition 5 immediately applies to our construction, as stated in the following corollary.

Corollary 1

Let \(\textit{F}:L\rightarrow L\) be a function that preserves meet of descending chains, and \(\gamma \in L\). Set \(\beta =\nabla _\textit{F}(\gamma )\). Then .

Proof

By definition \({\textit{Gen}}(\textit{F},\gamma ) = \varDelta _\textit{F}(\beta )\). Since \(\beta \) is pre-fixed by Proposition 1 and \(\textit{F}\) preserves meet of descending chains, by Proposition 5 we get the thesis.    \(\square \)

The characterization introduced above is important, but requires stronger assumptions on the function \(\textit{F}\). We now state a weaker result that is often enough for proving soundness, as will be illustrated in Sect. 5.

Proposition 6

Let \(\textit{F}:L\rightarrow L\) be monotone and \(\beta \in L\) a pre-fixed point, then

$$\begin{aligned} \varDelta _\textit{F}(\beta ) = \varDelta _\textit{F}\left( \sqcap C _{{\textit{F}},{\beta }} \right) \end{aligned}$$

hence, in particular, .

Proof

Set . First of all we note that z is pre-fixed, indeed . We prove separately the two inequalities.

  • \(\varDelta _\textit{F}({z}) \le \varDelta _\textit{F}(\beta )\). By Proposition 2 \(\varDelta _\textit{F}(z)\) is a fixed point, so in particular it is a post-fixed point, below z, by definition of \(\sqcap \) we get \(z \le \beta \), so by transitivity \(\varDelta _\textit{F}({z}) \le \beta \). By \({(\text {C}\textsc {o}\text {I}\textsc {nd})}\) we get \(\varDelta _\textit{F}({z}) \le \varDelta _\textit{F}(\beta )\).

  • \(\varDelta _\textit{F}(\beta ) \le \varDelta _\textit{F}({z})\). By Proposition 2 \(\varDelta _\textit{F}(\beta )\) is a fixed point, so in particular a post-fixed point, below \(\beta \). We prove by arithmetic induction that \(\varDelta _\textit{F}(\beta ) \le \textit{F}^n(\beta )\) for all \(n\in \mathbb {N}\).

    • Base \(\varDelta _\textit{F}(\beta ) \le \textit{F}^0(\beta )=\beta \) already proved.

    • Induction Let us assume \(\varDelta _\textit{F}(\beta ) \le \textit{F}^n(\beta )\), so by monotonicity of \(\textit{F}\) we get \(\textit{F}(\varDelta _\textit{F}(\beta )) \le \textit{F}^{n+1}(\beta )\). Since \(\varDelta _\textit{F}(\beta )\) is a post-fixed point, we have that \(\varDelta _\textit{F}(\beta ) \le \textit{F}(\varDelta _\textit{F}(\beta ))\), therefore by transitivity we get \(\varDelta _\textit{F}(\beta ) \le \textit{F}^{n+1}(\beta )\).

    • By definition of \(\sqcap \) we get \(\varDelta _\textit{F}(\beta ) \le \sqcap C _{{\textit{F}},{\beta }} = z\), so by \({(\text {C}\textsc {o}\text {I}\textsc {nd})}\) we get \(\varDelta _\textit{F}(\beta ) \le \varDelta _\textit{F}({z})\).

Finally by anti-symmetry we get the equality.    \(\square \)

Another way to read the lemma above is that, given a bound \(\beta \), we obtain the same greatest fixed point if we take as bound \(\sqcap C _{{\textit{F}},{\beta }}\). Indeed from Proposition 6 and point 1 of Proposition 5 we can say more: given a bound \(\beta \) which is pre-fixed, we obtain the same greatest fixed point below \(\beta \) if we take as bound any element \(\textit{F}^n(\beta )\) of the descending chain.

4 Proof Trees

In this section we formally define several proof-theoretic characterizations of inference systems with coaxioms, and prove their equivalenceFootnote 3 with the model-theoretic characterization given in the previous section.

First of all we recall the standard definition of proof trees and proof-theoretic characterization of inference systems.

Definition 3

Given an inference system \(\mathcal {I}\), a proof tree in \(\mathcal {I}\) is a tree whose nodes are (labeled with) judgments in \({\mathcal {U}}\), and there is a node \(\textit{c}\) with set of children \({\textit{Pr}}\) only if there exists a rule \({\displaystyle \frac{{\textit{Pr}}}{\textit{c}}}\). If a proof tree t in \(\mathcal {I}\) has root \( j \), then we say that t is a proof tree for \( j \), or that \( j \) has proof tree t, in \(\mathcal {I}\).

Theorem 2

Given an inference system \(\mathcal {I}\), and a judgment \( j \in {\mathcal {U}}\),

  1. 1.

    \( j \in {\textit{CoInd}}(\mathcal {I})\) iff \( j \) has a proof tree in \(\mathcal {I}\)

  2. 2.

    \( j \in {\textit{Ind}}(\mathcal {I})\) iff \( j \) has a finite proof tree in \(\mathcal {I}\).

See [15, 23].

The first proof-theoretic characterization is based on the following theorem, which slightly generalizes the standard correspondence between proof trees in \(\mathcal {I}\) and the coinductive interpretation of \(\mathcal {I}\).

Theorem 3

Given an inference system \(\mathcal {I}\), and \(\beta \subseteq {\mathcal {U}}\) a closed set of judgments, we have that, for all \( j \in {\mathcal {U}}\), \( j \in \varDelta _{\textit{F}_{\mathcal {I}}}(\beta )\) iff there exists a proof tree t for \( j \) in \(\mathcal {I}\) such that each node of t is in \(\beta \).

Proof

By Theorem 1, \(\varDelta _{\textit{F}_{\mathcal {I}}} (\beta ) = \varDelta _{\textit{F}_{{\mathcal {I}_{{\sqcap }\beta }}}}({\mathcal {U}}) = {\textit{CoInd}}({\mathcal {I}_{{\sqcap }\beta }})\). Thanks to Theorem 2 (1), we get that \(j \in {\textit{CoInd}}({\mathcal {I}_{{\sqcap }\beta }})\) iff there exists a proof tree t for j in \({\mathcal {I}_{{\sqcap }\beta }}\). By Definition 3, each node of t is (labeled by) a consequence \(\textit{c}\) of a rule in \({\mathcal {I}_{{\sqcap }\beta }}\), that is, \(\textit{c}\in \beta \) by definition of \({\mathcal {I}_{{\sqcap }\beta }}\), and this implies the thesis.    \(\square \)

As a particular case, we get our first proof-theoretic characterization

Corollary 2

Given an inference system with coaxioms \(({{\mathcal {I}},{\gamma }})\) and a judgment \(j \in {\mathcal {U}}\), we have that \(j \in {\textit{Gen}}(\mathcal {I},\gamma )\) iff there exists a proof tree t for j in \(\mathcal {I}\) such that each node of t has a finite proof tree in \({\mathcal {I}_{{\sqcup }\gamma }}\).

Proof

By Theorem 1, \({\textit{Gen}}(\mathcal {I},\gamma )=\varDelta _{\textit{F}_{\mathcal {I}}}(\beta )\), with \(\beta = \nabla _{\textit{F}_{\mathcal {I}}}(\gamma )\). Thanks to Theorem 3, we get that, for all \( j \in {\mathcal {U}}\), \( j \in {\textit{Gen}}(\mathcal {I},\gamma )\) iff there exists a proof tree t for \( j \) in \(\mathcal {I}\) such that each node of t is in \(\beta \). Again by Theorem 1 we get that \(\beta = {\textit{Ind}}({\mathcal {I}_{{\sqcup }\gamma }})\), so by Theorem 2 (2) we get that a node \( j '\) of t is in \(\beta \) iff there exists a finite proof tree for \( j '\) in \({\mathcal {I}_{{\sqcup }\gamma }}\).    \(\square \)

For the second proof-theoretic characterization, we need to define approximated proof trees.

In the definition below, let us denote by \( j _t\) the root of tree t.

Definition 4

Given an inference system with coaxioms \(({{\mathcal {I}},{\gamma }})\), the sets \(\mathcal {T}_n\) of approximated proof trees of level n in \(({{\mathcal {I}},{\gamma }})\), for \(n\in \mathbb {N}\), are inductively defined as follows:

figure b

In other words, an approximated proof tree of level n in \(({{\mathcal {I}},{\gamma }})\) is a finite proof tree in \(({{\mathcal {I}},{\gamma }})\) where coaxioms can only be used at depth \(\ge \)n.

The following lemma states that approximated proof trees of level n correspond to the n-th element of the descending chain \( C _{{{\textit{F}_{\mathcal {I}}}},{\beta }}=\{ \textit{F}^{n}_{\mathcal {I}}(\beta ) \mid n \in \mathbb {N}\}\), with \(\beta =\nabla _{\textit{F}_{\mathcal {I}}}(\gamma )={\textit{Ind}}({\mathcal {I}_{{\sqcup }\gamma }})\).

Lemma 1

Given an inference system with coaxioms \(({{\mathcal {I}},{\gamma }})\), and a judgment \( j \in {\mathcal {U}}\), we have that, for all \(n \in \mathbb {N}\), \( j \in \textit{F}^{n}_{\mathcal {I}}(\nabla _{\textit{F}_{\mathcal {I}}}(\gamma ))\) iff \( j \) has an approximated proof tree of level n in \(({{\mathcal {I}},{\gamma }})\).

Proof

Let \(\beta \) be \(\nabla _{\textit{F}_{\mathcal {I}}}(\gamma )\). We prove the thesis by induction on n.

  • Base If \(n=0\), then, by Theorem 1, \(\beta = \nabla _{\textit{F}_{\mathcal {I}}}(\gamma )\) corresponds to the inductive interpretation of \({\mathcal {I}_{{\sqcup }\gamma }}\), therefore the equivalence holds by Theorem 2 (2).

  • Induction We assume the equivalence for n and prove it for \(n+1\). We prove separately the two implications.

  • \(\Rightarrow \) If \(\textit{c}\in \textit{F}^{n+1}_{\mathcal {I}}(\beta )\), then there exists \({\displaystyle \frac{{\textit{Pr}}}{\textit{c}}} \in \mathcal {I}\) such that \({\textit{Pr}}\subseteq \textit{F}^{n}_{\mathcal {I}}(\beta )\). Hence, by inductive hypothesis, each judgment in \({\textit{Pr}}\) has an approximated proof tree of level n, that is, \({{\textit{Pr}}} = \{ j _t \mid t \in \mathcal {T}\}\), with \(\mathcal {T}{\subseteq } \mathcal {T}_n\). Hence, \(t = \displaystyle \frac{\mathcal {T}}{\textit{c}}\) is a proof tree for \(\textit{c}\), and by definition, \(t\in \mathcal {T}_{n+1}\).

  • \(\Leftarrow \) If \(t \in \mathcal {T}_{n+1}\) is an approximated proof tree for \(\textit{c}\in {\mathcal {U}}\), then, by definition, there exists \({\displaystyle \frac{{\textit{Pr}}}{\textit{c}}} \in \mathcal {I}\) such that \(t = \displaystyle \frac{\mathcal {T}}{\textit{c}}\), \({{\textit{Pr}}} = \{ j _t \mid t \in \mathcal {T}\}\), and \(\mathcal {T}\subseteq \mathcal {T}_n\). By inductive hypothesis we have \({\textit{Pr}}\subseteq {\textit{F}_{\mathcal {I}}}^n (\beta )\), and, by definition of \({\textit{F}_{\mathcal {I}}}\), this implies \(\textit{c}\in \textit{F}^{n+1}_{\mathcal {I}}(\beta )\) as needed.

   \(\square \)

Corollary 3

Given an inference system with coaxioms \(({{\mathcal {I}},{\gamma }})\), and a judgment \( j \in {\mathcal {U}}\), the following are equivalent:

  1. 1.

    \(j \in {\textit{Gen}}(\mathcal {I},\gamma )\)

  2. 2.

    there exists a proof tree t for \( j \) in \(\mathcal {I}\) such that each node has an approximated proof tree of level n in \(({{\mathcal {I}},{\gamma }})\), for all \(n \in \mathbb {N}\).

Proof

By Theorem 1, Proposition 6, and Theorem 3, we get that, for all \( j \in {\mathcal {U}}\), \( j \in {\textit{Gen}}(\mathcal {I},\gamma )\) iff there exists a proof tree t for \( j \) in \(\mathcal {I}\) such that each node \( j '\) of t is in with \(\beta = \nabla _{\textit{F}_{\mathcal {I}}}(\gamma )\). By Lemma 1, iff has an approximated proof tree of level n, for all \(n \in \mathbb {N}\).    \(\square \)

If the hypotheses of Proposition 5 are satisfied, then we get a simpler equivalent proof-theoretic characterization.

Corollary 4

Given an inference system with coaxioms \(({{\mathcal {I}},{\gamma }})\), and a judgment \(j \in {\mathcal {U}}\), if \({\textit{F}_{\mathcal {I}}}\) preserves meet of descending chains, then the following are equivalent:

  1. 1.

    \(j \in {\textit{Gen}}(\mathcal {I},\gamma )\)

  2. 2.

    \( j \) has an approximated proof tree of level n in \(({{\mathcal {I}},{\gamma }})\), for all \(n \in \mathbb {N}\).

Proof

Let \(\beta \) be \(\nabla _{\textit{F}_{\mathcal {I}}}(\gamma )\). By Theorem 1 and Proposition 5, we get that , therefore the thesis follows immediately from Lemma 1.    \(\square \)

5 Reasoning with Coaxioms

In this section we discuss proof techniques for inference systems with coaxioms.

Assume that \(\textit{G}={\textit{Gen}}(\mathcal {I},\gamma )\) is the interpretation generated by coaxioms for some \(({{\mathcal {I}},{\gamma }})\), and that \(\textit{S}\) (for “specification”) is the intended set of judgments, called valid in the following.

Typically, we are interested in proving \(\textit{S}\subseteq \textit{G}\) (completeness, that is, each valid judgment can be derived) and/or \(\textit{G}\subseteq \textit{S}\) (soundness, that is, each derivable judgment is valid). Proving both properties amounts to say that the inference system with coaxioms actually defines the intended set of judgments.

In the following, set \(\beta =\nabla _{\textit{F}_{\mathcal {I}}}(\gamma )={\textit{Ind}}({\mathcal {I}_{{\sqcup }\gamma }})\).

Completeness Proofs. To show completeness, we can use the bounded coinduction principle. Indeed, since \({\textit{G}}= \varDelta _{\textit{F}_{\mathcal {I}}} (\beta )\), if \(\textit{S}\le \beta \) and \(\textit{S}\) is a post-fixed point of \({\textit{F}_{\mathcal {I}}}\), by \({(\text {C}\textsc {o}\text {I}\textsc {nd})}\) we get that \(\textit{S}\le \textit{G}\). That is, using the notations of inference systems, to prove completeness it is enough to show that:

  • \(\textit{S}\subseteq {\textit{Ind}}({\mathcal {I}_{{\sqcup }\gamma }})\)

  • \(\textit{S}\subseteq {\textit{F}_{\mathcal {I}}}(\textit{S})\)

We illustrate the technique on the inference system with coaxioms \(({{\mathcal {I}},{\gamma }})\) which defines the judgment \(\textit{allPos}(l,b)\) (page 7). Set \(\textit{S}^{\textit{allPos}}\) the set of pairs \(({{l},{b}})\) where b is T if all the elements in l are positive, F otherwise. Completeness means that the judgment \(\textit{allPos}(l,b)\) can be proved, for all \(({{l},{b}})\in \textit{S}^{\textit{allPos}}\). By the bounded coinduction principle, it is enough to show that

  • \(\textit{S}^{\textit{allPos}}\subseteq {\textit{Ind}}({\mathcal {I}_{{\sqcup }\gamma }})\)

  • \(\textit{S}^{\textit{allPos}}\subseteq {\textit{F}_{\mathcal {I}}}(\textit{S}^{\textit{allPos}})\)

To prove the first condition, we have to show that, for each \(({{l},{b}})\in \textit{S}^{\textit{allPos}}\), \(\textit{allPos}(l,b)\) has a finite proof tree in \({\mathcal {I}_{{\sqcup }\gamma }}\). This can be easily shown, indeed:

  • If l contains a (first) non-positive element, hence \(l=x_1:\ldots : x_n:x:l'\) with \(x_i>0\) for \(i\in [1..n]\), \(x \le 0\), and \(b=F\) then we can reason by arithmetic induction on n. Indeed, for \(n=0\), \(({{l},{b}})\) is the consequence of the second rule with no premises, and for \(n>0\) it is the consequence of the third rule where we can apply the inductive hypothesis to the premise.

  • If l contains only positive elements, hence \(b=T\), then \(({{l},{b}})\) is a coaxiom, hence it is the consequence of a rule with no premises in \({\mathcal {I}_{{\sqcup }\gamma }}\).

To prove the second condition, we have to show that, for each \(({{l},{b}})\in \textit{S}^{\textit{allPos}}\), \(\textit{allPos}(l,b)\) is the consequence of a rule with premises in \(\textit{S}^{\textit{allPos}}\). This can be easily shown, indeed:

  • If \(l=\varLambda \), hence \(b=T\), then \(\textit{allPos}(\varLambda ,T)\) is the consequence of the first rule with no premises.

  • If \(l=x:l'\) with \(x\le 0\), hence \(b=F\), then \(\textit{allPos}(l,F)\) is the consequence of the second rule with no premises.

  • If \(l=x:l'\) with \(x > 0\), and \(b=T\), hence \(({{l'},{T}})\in \textit{S}^{\textit{allPos}}\), then \(\textit{allPos}(l,T)\) is the consequence of the third rule with premise \(({{l'},{T}})\), and analogously if \(b=F\).

Soundness Proofs. To show soundness, it is convenient to use the alternative characterization in terms of approximated proof trees given in Sect. 4, as detailed below. First of all, from Proposition 6, \(\textit{G}\subseteq \bigcap \{\textit{F}^{n}_{\mathcal {I}}(\beta )\mid n \ge 0\}\). Hence, to prove \(\textit{G}\subseteq \textit{S}\), it is enough to show that \(\bigcap \{\textit{F}^{n}_{\mathcal {I}}(\beta )\mid n \ge 0\}\subseteq \textit{S}\). Moreover, by Lemma 1, for all \(n\in \mathbb {N}\), judgments in \(\textit{F}^{n}_{\mathcal {I}}(\beta )\) are those which have an approximated proof tree of level n. Hence, to prove set inclusion, we can show that all judgments which have an approximated proof tree of level n for each n are in \(\textit{S}\) or equivalently, by contraposition, that judgments which are not in \(\textit{S}\), that is, non-valid judgments, fail to have an approximated proof tree of level n for some n.

We illustrate the technique again on the example of allPos. We have to show that, for each \(({{l},{b}})\not \in \textit{S}^{\textit{allPos}}\), there exists \(n\ge 0\) such that \(({{l},{b}})\) cannot be proved by using coaxioms at level greater than n. By cases:

  • If l contains a (first) non-positive element, hence \(l=x_1:\ldots : x_n:x:l'\) with \(x_i>0\) for \(i\in [1..n]\), \(x \le 0\), then, assuming that coaxioms can only be used at a level greater than \(n+1\), \(({{l},{b}})\) can only be derived by instantiating n times the third rule, and once the second rule, hence b cannot be T.

  • If l contains only positive elements, then it is immediate to see that there is no finite proof tree for \(({{l},{F}})\).

6 Taming Coaxioms: Advanced Examples

Mutual Recursion. Circular definitions involving inductive and coinductive judgments have no semantics in standard inference systems, because all judgments have to be interpreted either inductively, or coinductively. The same problem arises in the context of coinductive logic programming [32], where a logic program has a well-defined semantics only if inductive and coinductive predicates can be stratified: each stratum defines only inductive or coinductive predicates (possibly defined in a mutually recursive way), and cannot depend on predicates defined in upper strata. Hence, it is possible to define the semantics of a logic program only if there are no mutually recursive definitions involving both inductive and coinductive predicates.

We have already seen that an inductive inference system corresponds to a generalized inference system with no coaxioms, while a coinductive one corresponds to a generalized one where coaxioms consist of all judgments in \({\mathcal {U}}\); however, between these two extremes, coaxioms offer many other possibilities thus allowing a finer control on the semantics of the defined judgments.

There exist cases where two or more related judgments need to be defined recursively, but for some of them the correct interpretation is inductive, while for others is coinductive [5, 31, 32]. In such cases, coaxioms may be employed to provide the correct definition in terms of a single inference system with no stratification, although special care is required to get from the inference system the intended meaning of judgments. To see this, let us consider the judgment \({ path0 }(t)\), where t is an infinite treeFootnote 4 over \(\{0,1\}\), which holds iff there exists a path starting from the root of t and containing just 0s; trees are represented as infinite terms of shape \({ tree }(n,l)\), where \(n\in \{0,1\}\) is the root of the tree, and l is the infinite list of its direct subtrees. For instance, if \(t_1\) and \(t_2\) are the trees defined by the syntactic equations

$$\begin{aligned} t_1={ tree }(0,l_1) \quad l_1=t_2{:}t_1{:}l_1 \quad t_2={ tree }(0,l_2) \quad l_2={ tree }(1,l_1){:}l_2 \end{aligned}$$

then we expect \({ path0 }(t_1)\) to hold, but not \({ path0 }(t_2)\).

To define \({ path0 }\), we introduce an auxiliary judgment \({ is }\_{ in0 }(l)\) testing whether an infinite list l of trees contains a tree t such that \({ path0 }(t)\) holds. Intuitively, we expect \({ path0 }\) and \({ is }\_{ in0 }\) to be interpreted coinductively and inductively, respectively; this reflects the fact that \({ path0 }\) checks a property universally quantified over an infinite sequence (a safety property in the terminology of concurrent systems): all the elements of the path must equal 0; on the contrary, \({ is }\_{ in0 }\) checks a property existentially quantified over an infinite sequence (a liveness property in the terminology of concurrent systems): the list must contain a tree t with a specific property (that is, \({ path0 }(t)\) must hold). Driven by this intuition, one could be tempted to define the following inference system with coaxioms for all judgments of shape \({ path0 }(t)\), and no coaxioms for judgments of shape \({ is }\_{ in0 }(l)\):

$$ \displaystyle \frac{{ is }\_{ in0 }(l)}{{ path0 }({ tree }(0,l))}{}\quad \displaystyle \frac{\bullet }{{ path0 }(t)}\quad \displaystyle \frac{{ path0 }(t)}{{ is }\_{ in0 }(t{:}l)}\quad \displaystyle \frac{{ is }\_{ in0 }(l)}{{ is }\_{ in0 }(t{:}l)} $$

Unfortunately, because of the mutual recursion between \({ is }\_{ in0 }\) and \({ path0 }\), the inference system above does not capture the intended behavior: \({ is }\_{ in0 }(l)\) is derivable for every infinite list of trees l, even when l does not contain a tree t with an infinite path starting from its root and containing just 0s.

To overcome this problem, we replace the judgment \({ is }\_{ in0 }\) with the more general one \({ is }\_{ in }\), such that \({ is }\_{ in }(t,l)\) holds iff the infinite list l contains the tree t. Consequently, we can define the following generalized inference system:

$$ \displaystyle \frac{{ is }\_{ in }(t,l)\quad { path0 }(t)}{{ path0 }({ tree }(0,l))}{}\quad \displaystyle \frac{\bullet }{{ path0 }(t)}\quad \displaystyle \frac{}{{ is }\_{ in }(t,t{:}l)}\quad \displaystyle \frac{{ is }\_{ in }(t,l)}{{ is }\_{ in }(t,t'{:}l)} $$

Now the semantics of the system corresponds to the intended one, and we do not need to stratify the definitions into two separate inference systems.

Following the characterization in terms of proof trees and the proof techniques provided in Sects. 4 and 5, we can sketch a proof of correctness. Let \(\textit{S}\) be the set where elements have either shape \({ path0 }(t)\), where t represents a tree with an infinite path of just 0s starting from its root, or \({ is }\_{ in }(t,l)\), where l represents an infinite list containing the tree t; then a judgment belongs to \(\textit{S}\) iff it can be derived in the generalized inference system defined above.

Completeness: We first show that the set \(\textit{S}\) is a post-fixed point, that is, it is consistent w.r.t. the inference rules, coaxioms excluded. Indeed, if t has an infinite path of 0s, then it has necessarily shape \({ tree }(0,l)\), where l must contain a tree \(t'\) with an infinite path of 0s. Hence, the inference rule for \({ path0 }\) can be applied with premises \({ is }\_{ in }(t',l)\in {\textit{S}}\), and \({ path0 }(t')\in {\textit{S}}\). If an infinite list contains a tree t, then it has necessarily shape \(t'{:}l\) where, either \(t=t'\), and hence the axiom for \({ is }\_{ in }\) can be applied, or \(t\ne t'\) and t is contained in l, and hence the inference rule for \({ is }\_{ in }\) can be applied with premise \({ is }\_{ in }(t,l)\in {\textit{S}}\).

We then show that \({\textit{S}}\) is bounded by the closure of the coaxioms. For the elements of shape \({ path0 }(t)\) it suffices to directly apply the corresponding coaxiom; for the elements of shape \({ is }\_{ in }(t,l)\) we can show that there exists a finite proof tree built possibly also with the coaxioms by induction on the first position (where the head of the list corresponds to 0) in the list where t occurs. If the position is 0 (base case), then \(l=t{:}l'\), and the axiom can be applied; if the position is \(n>0\) (inductive step), then \(l=t'{:}l'\) and t occurs in \(l'\) at position \(n-1\), therefore, by inductive hypothesis, there exists a finite proof tree for \({ is }\_{ in }(t,l')\), therefore we can build a finite proof tree for \({ is }\_{ in }(t,l)\) by applying the inference rule for \({ is }\_{ in }\).

Soundness: We first observe that the only finite proof trees that can be derived for \({ is }\_{ in }(t,l)\) are obtained by application of the axiom for \({ is }\_{ in }\), hence \({ is }\_{ in }(t,l)\) holds iff there exists a finite proof tree for \({ is }\_{ in }(t, l)\) built with the inference rules for \({ is }\_{ in }\). Then, we can prove that, if \({ is }\_{ in }(t,l)\) holds, then t is contained in l by induction on the inference rules for \({ is }\_{ in }\). For the axiom (base case) the claim trivially holds, while for the other inference rule we have that if t belongs to l, then trivially t belongs to \(t'{:}l\).

For the elements of shape \({ path0 }(t)\) we first observe that by the coaxioms they all trivially belong to the closure of the coaxioms. Then, any proof tree for \({ path0 }(t)\) must be infinite, because there are no axioms but only one inference rule for \({ path0 }\) where \({ path0 }\) is referred in the premises; furthermore, such a rule is applicable only if the root of the tree is 0. We have already proved that if \({ is }\_{ in }(t,l)\) is derivable, then t belongs to l, therefore we can conclude that if \({ path0 }(t)\) is derivable, then t contains an infinite path starting from its root, and containing just 0s.

A Numerical Example. It is well known that real numbers in the closed interval [0,1] can be represented by infinite sequences \((d_i)_{i\in \mathbb {N}^+}\) of decimalFootnote 5 digits, where \(\mathbb {N}^+\) denotes the set of all positive natural numbers. Indeed, \((d_i)_{i\in \mathbb {N}^+}\) represents the real number which is the limit of the series \(\sum _{i=1}^{\infty }10^{-i}d_i\) in the standard complete metric space of real numbers (such a limit always exists by completeness, because the associated sequence of partial sums is always a Cauchy sequence). Such a representation is not unique for all rational numbers in [0,1] (except for the bounds 0 and 1) that can be represented by a finite sequence of digits followed by an infinite sequence of 0s; for instance, 0.42 can be represented either by the sequence \(42\bar{0}\), or by the sequence \(41\bar{9}\), where \(\bar{d}\) denotes the infinite sequence containing just the digit d.

For brevity, for \(r=(d_i)_{i\in \mathbb {N}^+}\), \(\left[ \!\left[ {r\,}\!\right] \!\right] \) denotes \(\sum _{i=1}^{\infty }10^{-i}d_i\) (that is, the real number represented by r). We want to define the judgment \({ add }(r_1,r_2,r,c)\) which holds iff \(\left[ \!\left[ {r_1\,}\!\right] \!\right] +\!\left[ \!\left[ {r_2\,}\!\right] \!\right] =\!\left[ \!\left[ {r\,}\!\right] \!\right] +c\) with c an integer number; that is, \({ add }(r_1,r_2,r,c)\) holds iff the addition of the two real numbers represented by the sequences \(r_1\) and \(r_2\) yields the real number represented by the sequence r with carry c. We will soon discover that, to get a complete definition for \({ add }\), c is required to range over a proper superset of the set \(\{0,1\}\), differently from what one could initially expect.

We can define the judgment \({ add }\) with the following generalized inference system, where \(\div \) and \(\bmod \) denote the integer division, and the remainder operator, respectively.

$$ \begin{array}{c} \displaystyle \frac{{ add }(r_1,r_2,{r,c})}{{ add }(d_1{:}r_1,d_2{:}r_2,(s \bmod 10){:}r,s \div 10)}{s=d_1+d_2+c}\\[3ex] \displaystyle \frac{\bullet }{{ add }(r_1,r_2,r,c)} \end{array} $$

A real number in [0,1] is represented by an infinite list of decimal digits, which, therefore, can always be decomposed as d : r, where d is the first digit (corresponding to the exponent \(-1\)), and r is the rest of the list of digits. Here, \(r_1\), \(r_2\), and r range over the set of infinite lists of decimal digits, while the carry must range over \(\{-1,0,1,2\}\) to support a complete definition. As clearly emerges from the proof of completeness provided below, besides the obvious values 0 and 1, the values \(-1\) and 2 have to be considered for the carry to ensure a complete definition of \({ add }\) because both \({ add }(\bar{0},\bar{0},\bar{9},-1)\) and \({ add }(\bar{9},\bar{9},\bar{0},2)\) hold, and, hence, should be derivable; these two judgments allow the derivation of an infinite number of other valid judgments, as, for instance, \({ add }(1\bar{0},1\bar{0},1\bar{9},0)\) and \({ add }(1\bar{9},1\bar{9},4\bar{0},0)\), respectively.

Also in this case we can sketch a proof of correctness: for all infinite sequences of decimal digits \(r_1\), \(r_2\) and r, and all \(c\in \{-1,0,1,2\}\), \({ add }(r_1,r_2,r,c)\) is derivable iff \(\left[ \!\left[ {r_1\,}\!\right] \!\right] +\!\left[ \!\left[ {r_2\,}\!\right] \!\right] =\!\left[ \!\left[ {r\,}\!\right] \!\right] +c\).

Completeness: By the coaxioms we trivially have that each element \({ add }(r_1,r_2,r,c)\) such that \(\left[ \!\left[ {r_1\,}\!\right] \!\right] +\!\left[ \!\left[ {r_2\,}\!\right] \!\right] =\!\left[ \!\left[ {r\,}\!\right] \!\right] +c\) with \(c\in \{-1,0,1,2\}\) belongs to the closure of the coaxioms.

To show that the unique inference rule of the system is consistent with the set of all correct judgments, let us assume that \(\left[ \!\left[ {r'_1\,}\!\right] \!\right] +\!\left[ \!\left[ {r'_2\,}\!\right] \!\right] =\!\left[ \!\left[ {r'\,}\!\right] \!\right] +c'\) with \(r'_1=d_1{:}r_1\), \(r'_2=d_2{:}r_2\), \(r'=d{:}r\) and \(c'\in \{-1,0,1,2\}\). Let us set \(s=10c'+d\), and \(c=s-d_1-d_2\), then \(s\bmod 10=d\) and \(s\div 10=c'\), and we get the desired conclusion of the inference rule, and the side condition holds; it remains to show that \(\left[ \!\left[ {r_1\,}\!\right] \!\right] +\!\left[ \!\left[ {r_2\,}\!\right] \!\right] =\!\left[ \!\left[ {r\,}\!\right] \!\right] +c\) with \(c\in \{-1,0,1,2\}\).

We first observe that by the properties of limits w.r.t. the usual arithmetic operations, and by definition of \(\left[ \!\left[ {-\,}\!\right] \!\right] \), for all infinite sequence r of decimal digits, if \(r=d{:}r'\), then \(\left[ \!\left[ {r\,}\!\right] \!\right] =10^{-1}(d+\!\left[ \!\left[ {r'\,}\!\right] \!\right] )\); then, from the hypotheses we get the equality \(d_1+\!\left[ \!\left[ {r_1\,}\!\right] \!\right] +d_2+\!\left[ \!\left[ {r_2\,}\!\right] \!\right] =d+\!\left[ \!\left[ {r\,}\!\right] \!\right] +10c'\), and, therefore, \(\left[ \!\left[ {r_1\,}\!\right] \!\right] +\!\left[ \!\left[ {r_2\,}\!\right] \!\right] =\!\left[ \!\left[ {r\,}\!\right] \!\right] +c\); finally, since \(c=\!\left[ \!\left[ {r_1\,}\!\right] \!\right] +\!\left[ \!\left[ {r_2\,}\!\right] \!\right] -\!\left[ \!\left[ {r\,}\!\right] \!\right] \), and \(0\le \!\left[ \!\left[ {r_1\,}\!\right] \!\right] ,\!\left[ \!\left[ {r_2\,}\!\right] \!\right] ,\!\left[ \!\left[ {r\,}\!\right] \!\right] \le 1\), we get \(c\in \{-1,0,1,2\}\).

Soundness: Let \(r'_1=d_1{:}r_1\), \(r'_2=d_2{:}r_2\), and \(r'=d{:}r\) be infinite sequences of decimal digits, and \(c'\in \{-1,0,1,2\}\); we observe that the judgment \({ add }(r'_1,r'_2,r',c')\) can be derived from the unique inference rule only with the premise \({ add }(r_1,r_2,r,c)\) where c must equal \(10c'+d-d_1-d_2\) and must range over \(\{-1,0,1,2\}\).

To prove soundness we show that if \({\!\left[ \!\left[ {r'_1\,}\!\right] \!\right] }+\!\left[ \!\left[ {r'_2\,}\!\right] \!\right] \ne \!\left[ \!\left[ {r'\,}\!\right] \!\right] +c'\), then \({ add }(r'_1,r'_2,r',c')\) cannot be derived in the inference system. Let us set \(\delta '=|\!\left[ \!\left[ {r'\,}\!\right] \!\right] +c'-\!\left[ \!\left[ {r'_1\,}\!\right] \!\right] -\!\left[ \!\left[ {r'_2\,}\!\right] \!\right] |\); obviously, under the hypothesis \(\left[ \!\left[ {r'_1\,}\!\right] \!\right] +\!\left[ \!\left[ {r'_2\,}\!\right] \!\right] \ne \!\left[ \!\left[ {r'\,}\!\right] \!\right] +c'\), we get \(\delta ' >0\). In particular, the following fact holds: if \(\delta '\ge 4\cdot 10^{-1}\), then \(10c'+d-d_1-d_2\not \in \{-1,0,1,2\}\). Indeed, by the identity \(\left[ \!\left[ {r\,}\!\right] \!\right] =10^{-1}(d+\!\left[ \!\left[ {r'\,}\!\right] \!\right] )\) already used for the proof of completeness, we have that \(\delta '=10^{-1}|\!\left[ \!\left[ {r\,}\!\right] \!\right] +c-\!\left[ \!\left[ {r_1\,}\!\right] \!\right] -\!\left[ \!\left[ {r_2\,}\!\right] \!\right] |\), with \(c=10c'+d-d_1-d_2\); \(10^{-1}(\!\left[ \!\left[ {r\,}\!\right] \!\right] +c-\!\left[ \!\left[ {r_1\,}\!\right] \!\right] -\!\left[ \!\left[ {r_2\,}\!\right] \!\right] ) \ge 4\cdot 10^{-1}\) implies \(c\ge 3\) (\(\left[ \!\left[ {r_1\,}\!\right] \!\right] ,\!\left[ \!\left[ {r_2\,}\!\right] \!\right] ,\!\left[ \!\left[ {r\,}\!\right] \!\right] \in [0,1]\)), and, hence, \(c=10c'+d-d_1-d_2\not \in \{-1,0,1,2\}\). By duality, \(10^{-1}(\!\left[ \!\left[ {r\,}\!\right] \!\right] +c-\!\left[ \!\left[ {r_1\,}\!\right] \!\right] -\!\left[ \!\left[ {r_2\,}\!\right] \!\right] )\le -4\cdot 10^{-1}\) implies \(c\le -2\), hence \(c=10c'+d-d_1-d_2\not \in \{-1,0,1,2\}\).

By virtue of this fact, and thanks to the hypotheses, we can prove by arithmetic induction over n that for all \(n\ge 1\), if \(\delta '\ge 4\cdot 10^{-n}\), then there exists a finite proof tree for \({ add }(r'_1,r'_2,r',c')\) where the coaxioms are applied at most at depth \(n-1\). The base case is directly derived from the previously proven fact. For the inductive step we observe that if the inference rule is applicable for deriving the conclusion \({ add }(r'_1,r'_2,r',c')\), then we can apply the inductive hypothesis for the premise \({ add }(r_1,r_2,r,c)\) since we have already shown that \(\delta '=10^{-1}\delta \), therefore \(\delta \ge 4\cdot 10^{-(n-1)}\).

We can now conclude by observing that if \(\left[ \!\left[ {r'_1\,}\!\right] \!\right] +\!\left[ \!\left[ {r'_2\,}\!\right] \!\right] \ne \!\left[ \!\left[ {r'\,}\!\right] \!\right] +c'\), then there exists n such that \(\delta '\ge 4\cdot 10^{-n}\), therefore, by the previous result, we deduce that it is not possible to build a finite tree for \({ add }(r'_1,r'_2,r',c')\) where the coaxioms are applied at arbitrary depth k (in particular, k is bounded by \(n-1\)); therefore \({ add }(r'_1,r'_2,r',c')\) cannot be derived in the inference system.

From the proof of soundness we can also deduce that if we let c range over \(\mathbb {Z}\), then the inference system becomes unsound; for instance, \({ add }(\bar{0},\bar{0},\bar{0},1)\) would be derivable, but \(\left[ \!\left[ {\bar{0}\,}\!\right] \!\right] +\!\left[ \!\left[ {\bar{0}\,}\!\right] \!\right] \ne \!\left[ \!\left[ {\bar{0}\,}\!\right] \!\right] +1\):

$$ \dfrac{ \dfrac{ \vdots }{{ add }(\bar{0},\bar{0},\bar{0},10^1)} }{{ add }(\bar{0},\bar{0},\bar{0},10^0)} $$

Big-Step Operational Semantics with Divergence. It is well-known that divergence cannot be captured by the big-step operational semantics of a programming language when semantic rules are interpreted inductively (that is, in the standard way) [4, 6, 23]. When rules are interpreted coinductively some partial result can be obtained under suitable hypotheses, but a practical way to capture divergence with a big-step operational semantics is to introduce two different forms of judgment [14, 23]: one corresponds to the standard big-step evaluation relation, and is defined inductively, while the other one captures divergence, and is defined coinductively in terms of the inductive judgment, thus requiring stratification. Other approaches consist in considering coinductive trace-based big-step semantics [27], and flag-based big-step semantics [29].

Fig. 1.
figure 1

Call-by-value big-step semantics of \(\lambda \)-calculus with divergence

With coaxioms a unique judgment can be defined in a more direct and compact way. We showFootnote 6 how this is possible for the standard call-by-value operational semantics of the \(\lambda \)-calculus in big-step style. Figure 1 defines syntax, values, and semantic rules. The meta-variable v ranges over standard values, that is, lambda abstractions, while \(v_\infty \) includes also divergence, represented by \(\infty \). The evaluation judgment has the general shape \({e}\Rightarrow {v_\infty }\), meaning that either e evaluates to a value v (when \(v_\infty \ne \infty \)) or diverges (when \(v_\infty =\infty \)).

For what concerns the semantic rules, only a coaxiom is needed, stating that every expression may diverge. This ensures that \(\infty \) can be the only allowed outcome for the evaluation of an expression which diverges; this can only happen when the corresponding derivation tree is infinite. Rule (val) is standard. Rule (app) deals with the evaluation of application when both expressions \(e_1\) and \(e_2\) do not diverge; the meta-variable v is required for the judgment \({e_2}\Rightarrow {v}\) to guarantee convergence of \(e_2\), while \(v_\infty \) is used for the result of the whole application, since the evaluation of the body of the lambda abstraction could diverge. As usual, \({e}[{x}\leftarrow {v}]\) denotes capture-avoiding substitution modulo \(\alpha \)-renaming. Rules (l-inf) and (r-inf) cover the cases when either \(e_1\) or \(e_2\) diverges when trying to evaluate application, assuming that a left-to-right evaluation strategy has been imposed.

We show that the only judgment derivable for \(e_\varDelta =(\lambda x.x\ x) \lambda x.x\ x\) is \({e_\varDelta }\Rightarrow {\infty }\). To this aim, we first disregard the coaxiom and exhibit an infinite derivation tree for the judgment \({e_\varDelta }\Rightarrow {v_\infty }\), derivable for all \(v_\infty \):

$$ { \begin{array}{l} \displaystyle \text{(app) }\,\frac{ \displaystyle \text{(val) }\,\frac{}{{\lambda x.x\ x}\Rightarrow {\lambda x.x\ x}} \quad \displaystyle \text{(val) }\,\frac{}{{\lambda x.x\ x}\Rightarrow {\lambda x.x\ x}} \quad \displaystyle \text{(app) }\,\frac{\vdots }{{{(x\ x)}[{x}\leftarrow {\lambda x.x\ x}]}\Rightarrow {v_\infty }}}{{{(x\ x)}[{x}\leftarrow {\lambda x.x\ x}]=e_\varDelta }\Rightarrow {v_\infty }} \end{array} } $$

In this particular case the derivation tree is also regular, but of course there are examples of divergent computations whose derivation tree is not regular. The vertical dots indicate that the derivation continues with the same repeated pattern. The derivation corresponds to the coinductive interpretation of the standard big-step semantics rules [4, 23], which may exhibit non-deterministic behavior as happens for this example; however, here the coaxiom plays a crucial role by filtering out all undesired values, and, thus, leaving only the value \(\infty \) representing divergence; indeed, by employing also the coaxiom, finite derivation trees can be built for \({e_\varDelta }\Rightarrow {v_\infty }\) only when \(v_\infty =\infty \). By Corollary 3 we can get an infinite sequence of approximating derivation trees of arbitrarily increasing level:

$$ { \begin{array}{l} \displaystyle \text{(coax) }\,\frac{}{{e_\varDelta }\Rightarrow {\infty }} \\[4ex] \displaystyle \text{(app) }\,\frac{ \displaystyle \text{(val) }\,\frac{}{{\lambda x.x\ x}\Rightarrow {\lambda x.x\ x}} \quad \displaystyle \text{(val) }\,\frac{}{{\lambda x.x\ x}\Rightarrow {\lambda x.x\ x}} \quad \displaystyle \text{(coax) }\,\frac{}{{{(x\ x)}[{x}\leftarrow {\lambda x.x\ x}]}\Rightarrow {\infty }}}{{{(x\ x)}[{x}\leftarrow {\lambda x.x\ x}]=e_\varDelta }\Rightarrow {\infty }} \\[4ex] \vdots \end{array} } $$

As a consequence, in the inference system with the coaxiom a valid infinite derivation tree can be built for \({e_\varDelta }\Rightarrow {v_\infty }\) only when \(v_\infty =\infty \).

7 Related Work

Inference systems [1] are widely adopted to formally define operational semantics, language translations, type systems, subtyping relations, and other relevant judgments. Although inference systems have been introduced for dealing with inductive recursive definitions, in the last two decades several authors have focused on their coinductive interpretation.

Cousot and Cousot [14] define divergence by the coinductive interpretation of an inference system which extends the big-step operational semantics. The same approach is followed by other authors [16, 23, 30]. Leroy and Grall [23] analyze two kinds of coinductive big-step operational semantics for the call-by-value \(\lambda \)-calculus, and study their relationships with the small-step and denotational semantics, and their suitability for compiler correctness proofs. Coinductive big-step semantics is used as well to reason on cyclic objects stored in memory [24, 26], and to prove type soundness in Java-like languages [4, 6]. Coinductive inference systems are also considered in the context of type analysis and subtyping for object-oriented languages [7, 9].

More recently, several solutions have been proposed to extend existing programming languages to support corecursion, and are, therefore, more focused on operational aspects, and their corresponding implementation issues; contributions can be found for all main computational paradigms: logic [5, 21, 25], functional [18, 19], and object-oriented programming [10, 11].

For the logic paradigm, the starting point is coinductive logic programming (coLP) [32], an extension of logic programming which provides both a declarative and a sound but not complete operational semantics for coinductive predicates, the former based on the notion of complete Herbrand base (finite and infinite terms) and greatest fixed point. However, only the standard coinductive interpretation is supported, and mixing between inductive and coinductive predicates is only partially supported through stratification. Structural resolution [21] is an extension of the operational semantics for coLP not limited to regular derivations. Other proposals [5, 25] provide more flexible operational semantics. The notion of finally clause [5] has inspired our notion of coaxiom: finally clauses are user-defined facts that are resolved when an infinite, but regular, derivation is detected, in replacement of the standard coinductive semantics. Despite the existing strong correlation with this paper, the semantics of finally clauses does not always coincide with a fixed point of the one step inference operator. Similar considerations apply also to the work on coFJ [10, 11], where with clauses play a role similar to that of finally clauses for coLP. A first attempt to provide a denotational model for this language, overtaken by the present work, has been provided in [8].

CoCaml [18, 19] is an extension of OCaml where the semantics of recursive functions can be parametric in an equation solver which can be either pre-defined, or explicitly provided by the programmer to support corecursion. The intuition suggests that choosing a solver corresponds to choose a specific partial order, in such a way that the desired function is a fixed point in the corresponding CPO. Among the several proposed solvers, the pre-defined iterator solver has an expressive power similar to that of the finally and with clauses mentioned above.

As already mentioned, the spirit of our work is very different from that on CoCaml, since we do not aim to extend a practical language with corecursion, but, rather, to provide a very general framework which smoothly extends the well-known notion of inference system, and that could be used in many useful contexts, as shown in Sect. 6. On the other hand, definitions of higher order functions cannot be directly supported by inference systems. The foundation of CoCaml [20] is based on the theory of recursion in the framework of coalgebras. Our approach, instead, relies on the standard complete lattice of powersets, with set inclusion as partial order. In this way, a single and simple model based on classical results works uniformly for any possible recursive definition expressed in terms of a generalized inference system.

Recursive and well-founded coalgebras [12] are a framework for generalized structural recursion.

Completely iterative algebras [2] and corecursive and anti-founded algebras [12] are frameworks for generalized structural corecursion; iterative algebras correspond to the rational case as opposed to the coinductive one.

In guarded recursion [13, 28] a judgment can be proved by also using recursively the judgment itself, provided that such recursive call is guarded by introduction rules. The goal, similar to ours, is to obtain a unique fixed point, however, there is no counterpart of the guard notion in the general framework of inference systems.

8 Conclusion

We have presented a generalized notion of inference system by introducing coaxioms, to support flexible definitions of judgments by structural recursion on non-well-founded datatypes.

Consequently, we have generalized the meta-theory of inference systems by providing two equivalent semantics, one based on fixed points in a complete lattice, and the other on the notion of proof tree. In the former case, the semantics of an inference system is the greatest fixed point of its corresponding one step inference operator, below the least pre-fixed point containing the coaxioms; in the latter case, the standard notion of proof tree for the coinductive case is generalized by requiring coaxioms to be applicable “at an infinite depth”.

We have provided proof techniques for proving soundness and completeness results and shown their application to a range of different examples.

A compelling direction for further developments is exploring mechanization in proof assistants and other proof techniques [17] for coaxioms.

A necessarily not complete prototype meta-interpreter has been implemented in SWI-PrologFootnote 7 to test the examples provided in Sects. 2 and 6. SWI-Prolog offers a natural support to regular terms (a.k.a. cyclic terms) through unification, but examples involving non-regular terms (or derivations) cannot terminate.

Extending the notion of coaxiom in the setting of object-oriented and functional programming is more challenging, because of the gap between the underlying theories.

We plan to investigate the dual notion studied here: one could consider the least fixed point above the greatest post-fixed point contained in the coaxioms, instead of the greatest fixed point below the least pre-fixed point containing the coaxioms. In particular, it would be interesting studying inference systems for which the two different semantics coincide.