1 Introduction

Barendregt’s genericity lemma [14, Prop. 14.3.24] is a classic result in the theory of the untyped \(\lambda \)-calculus. It expresses the fact that meaningless terms—also called unsolvable terms, a notion generalizing the bad behaviour of the paradigmatic looping term \(\varOmega {:}{=}(\lambda x.xx)(\lambda x.xx)\)—are a sort of black hole for evaluation: if evaluation should ever enter them, it would never get out. The lemma states this somewhat dually: if a term \(t\) containing a meaningless term \(u\) evaluates to a normal form, that is, if \(t\) is observable, then replacing \(u\) with any other term in \(t\) gives a term \(t'\) that is also observable. Roughly, if one can observe a term containing a black hole then evaluation never enters the black hole.

Genericity is arguably more than a lemma, but it is so labeled because its main use is as a tool in Barendregt’s proofs of collapsibility of meaningless terms, that is, the fact that the equational theory \(\mathcal {H}\) equating all meaningless terms is consistent, i.e. it does not equate all terms. Such collapsibility is one of the cornerstones of the semantics of the untyped \(\lambda \)-calculus.

Recap about Meaningless Terms. Meaningless terms were first studied in the 1970s, by Wadsworth [45, 46] and Barendregt [12, 13], while working on denotational models and the representation of partial recursive functions (PRFs). The starting point is that the natural choice of representing the undefined behaviour of PRFs—considered as the paradigmatic meaningless computation—with terms not having a normal form leads to a problematic representation of PRFs. The issue is visible also at the equational level, as all theories collapsing all diverging terms are inconsistent. Wadsworth and Barendregt then identify the class of unsolvable terms as a better notion of meaningless terms: the representation of PRFs using them as undefined terms is better behaved, they are collapsible, and in particular they are identified in Scott’s first \(D_\infty \) model of the untyped \(\lambda \)-calculus.

Unsolvable terms are defined via a contextual property, but they are also characterized as the terms diverging for head \(\beta \)-reduction \(\rightarrow _{\texttt{h}}\), rather than for plain \(\beta \)-reduction \(\rightarrow _{\beta }\). The dual notion of solvable terms, which are those terminating for head reduction, is taken as the right notion of defined term, replacing the natural but misleading idea that defined terms are those having a \(\beta \)-normal form.

Barendregt’s classic book from the 1980s [14] is built around the concept of (un)solvability. Visser and Statman noted that (un)solvability is not the only partition of terms providing good representations of PRFs and being collapsible, as summarized by Barendregt [15]. Typically, (in)scrutable terms, first studied by Paolini and Ronchi della Rocca [36, 38, 41] (under the name (non-)potentially valuable terms), provide an alternative good partition. In call-by-name (CbN), (in)scrutable terms correspond to weak head normalizing/diverging terms.

This Paper. The work presented here stems from the desire to obtain genericity for the untyped call-by-value \(\lambda \)-calculus. Perhaps surprisingly, the call-by-value (shortened to CbV) \(\lambda \)-calculus behaves quite differently with respect to meaningless terms. Accattoli and Guerrieri’s recent study of meaningless terms in CbV [6] indeed stresses two key differences: genericity fails in CbV, and collapsibility fails as well, as any equational theory equating CbV meaningless terms is inconsistent, if one considers as meaningless the CbV analogue of unsolvable terms. Accattoli and Guerrieri also show that collapsibility can be recovered by adopting a different notion of meaningless terms, namely CbV inscrutable terms, but they do not prove genericity for them.

In this paper, we do prove a genericity result for inscrutable terms, and also provide a new proof of their collapsibility. These results, however, are only a small part of the contributions of this paper.

Contribution 1: the Very Statement of Genericity. We start by focussing on the statement of genericity. The literature contains various versions. The one used by Barendregt for proving collapsibility is the following (where unsolvable terms are identified with \(\rightarrow _{\texttt{h}}\)-diverging terms), here dubbed heavy:

$$\begin{aligned} \begin{array}{c} {\boldsymbol{Heavy\,genericity}}{} \textit{: let }u\textit{ be }\rightarrow _{\texttt{h}}\textit{-diverging and }C\textit{ be a context such that }\\ C\langle u\rangle \rightarrow _{\beta }^* n\textit{ with }n\beta \textit{-normal. Then, }C\langle t\rangle \rightarrow _{\beta }^* n\textit{ for all }t\textit{.} \end{array} \end{aligned}$$

In Takahashi’s elegant proof of heavy genericity [44]—which is an inspiration for our work—the following statement is called fundamental property of unsolvable \(\lambda \)-terms, which we here consider as an alternative, lighter statement for genericity:

$$\begin{aligned} \begin{array}{c} {\boldsymbol{Light\,genericity}}{} \textit{: let }u\textit{ be }\rightarrow _{\texttt{h}}\textit{-diverging and }C\textit{ be a context such that } C\langle u\rangle \textit{ is }\\ \rightarrow _{\texttt{h}}\textit{-normalizing. Then, }C\langle t\rangle \textit{ is }\rightarrow _{\texttt{h}}\textit{-normalizing for all }t\textit{.} \end{array} \end{aligned}$$

We adopt the lighter statement as the proper one for genericity for three reasons:

  1.

    Powerful enough. We show that the collapsibility of unsolvable terms follows already from the light notion: there is no need to consider reductions to \(\beta \)-normal form, nor the fact that the normal forms of \(C\langle u\rangle \) and \(C\langle t\rangle \) coincide.

  2.

    Economical and natural. The light version involves fewer concepts and is more in line with the motivations behind (un)solvability: if the right notion of defined term is that of head normalizable terms, it is somewhat odd to state genericity with respect to \(\beta \)-normal forms.

  3.

    Modularity. In CbV, it is less clear what notion of normal form to use for the heavy statement, as shall be explained below. The light version, instead, adapts naturally. It is also impossible to have a heavy form of the co-genericity property given below, since the involved terms have no (full) normal form.

We then adapt Takahashi’s proof of heavy genericity to the light case.

Contribution 2: (Open) Contextual Equivalence/Pre-Order. Once one adopts the light statement, a connection with contextual equivalence becomes evident. Precisely, consider the contextual pre-order (that is, the asymmetric variant of contextual equivalence) induced by head reduction:

$$\begin{aligned} \begin{array}{c} {\boldsymbol{Head\,contextual\,pre}}{} {\textbf {-}}{\boldsymbol{order}}: t\precsim _{\mathcal {C}}^\texttt{h}u\textit{ if }C\langle t\rangle \textit{ is }\rightarrow _{\texttt{h}}\textit{-normalizing implies }C\langle u\rangle \textit{ is}\\ \rightarrow _{\texttt{h}}\textit{-normalizing, for all closing contexts }C\textit{.} \end{array} \end{aligned}$$

Light genericity seems to be a rephrasing of the fact that \(\rightarrow _{\texttt{h}}\)-diverging terms are minimum terms with respect to \(\precsim _{\mathcal {C}}^\texttt{h}\). There is however a small yet non-trivial glitch: contextual pre-orders/equivalences are defined using closing contexts, while genericity is defined using arbitrary, possibly open contexts. Is the closing requirement essential in the definition of contextual notions? To our knowledge, this question has not been addressed in the literature. In fact, there is no absolute answer for all cases, as it depends on the notion of observation and on the underlying calculus.

We show that, for head reduction, open and closed contextual notions do coincide, a fact that we refer to by saying that head reduction is openable. As is often the case with behavioral notions, proving that head reduction is openable cannot be done by simply unfolding the definitions, and requires some work.

The proof that we provide is—we believe—particularly elegant. It is obtained as a corollary of a further contribution: the revisiting of another classic result from the theory of the untyped \(\lambda \)-calculus, described next.

Contribution 3: Maximality. Barendregt proves that open head contextual equivalence—what he denotes as the equational theory \(\mathcal {H}^*\)—is maximal among consistent equational theories, i.e. any extension of \(\mathcal {H}^*\) is inconsistent (moreover, \(\mathcal {H}^*\) is the unique maximum theory among those collapsing unsolvable terms). His proof uses Böhm theorem, an important but non-trivial result. We give a new proof based only on light genericity, which is an arguably simpler result than Böhm theorem, obtained by adapting a similar result for CbV by Egidi et al. [19].

Contribution 4: Call-by-Value. Finally, we study the CbV case, adopting inscrutable terms as the notion of meaningless terms. In Plotkin’s original CbV calculus [40], however, these terms cannot be characterized as diverging for some strategy. Moreover, in Plotkin’s calculus evaluation is not openable, that is, open and closed contextual notions do not coincide. In both cases, the issue is connected to the management of open terms.

We then adopt Accattoli and Paolini’s value substitution calculus (VSC) [9], which is an extension of Plotkin’s calculus solving its well-known issues with open terms and having the same (closed) contextual equivalence. Therein, inscrutable terms are characterized as those diverging for weak evaluation \(\rightarrow _{\texttt{w}}\).

For the VSC, we prove light genericity for \(\rightarrow _{\texttt{w}}\)-diverging terms. We use a different technique than in the CbN case, namely we rely on Ehrhard’s CbV multi types [20] (multi types are also known as non-idempotent intersection types), because Takahashi’s technique does not easily adapt to the CbV case. We also give a proof of maximality (essentially the argument by Egidi et al. [19] that serves as blueprint for the CbN case), from which it follows that evaluation in the VSC is openable, in contrast with evaluation in Plotkin’s calculus.

As hinted at above, it is relevant that in CbV we study light genericity rather than the heavy variant because the notion of full normal form in the CbV case is less standard. Firstly, it differs between Plotkin’s calculus and the VSC. Secondly, it also differs between various refinements of Plotkin’s calculus that can properly manage open terms, as discussed by Accattoli et al. [7].

Contribution 5: Co-Genericity. A difference between the head CbN case and the weak CbV case is given by an interesting class of terms, those evaluating to an infinite sequence of abstractions, that is, such that \(t\rightarrow _{\beta }^* \lambda x.t'\) with \(t'\) having the same property. Such terms are \(\rightarrow _{\texttt{h}}\)-diverging (thus head CbN meaningless), but \(\rightarrow _{\texttt{w}}\)-normalizing (CbV meaningful), and hereditarily so. We prove that these \(\rightarrow _{\texttt{w}}\)-super (normalizing) terms are maximum elements of the CbV contextual pre-order, and the statement of this fact is a new notion of co-genericity:

$$\begin{aligned} \begin{array}{c} {\boldsymbol{Co-Genericity}}{} \textit{: let }t\textit{ be }\rightarrow _{\texttt{w}}\textit{-super and }C\textit{ be a context such that }C\langle u\rangle \textit{ is}\\ \rightarrow _{\texttt{w}}\textit{-normalizing for some }u\textit{. Then, }C\langle t\rangle \textit{ is }\rightarrow _{\texttt{w}}\textit{-normalizing.} \end{array} \end{aligned}$$

We then show a strengthened collapsibility result: all \(\rightarrow _{\texttt{w}}\)-diverging terms and all \(\rightarrow _{\texttt{w}}\)-super terms can be consistently collapsed.
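The separating class of terms can be observed concretely. The following Python sketch is our own illustration, not part of the paper's development: terms are encoded as tuples, substitution is naive (safe here, since only closed terms are substituted), and weak evaluation is plain closed-term weak CbV \(\beta \)-reduction, which we take as a reasonable proxy for \(\rightarrow _{\texttt{w}}\) on these closed examples. The term \((\lambda x.\lambda y.xx)(\lambda x.\lambda y.xx)\) reduces to an infinite sequence of abstractions: it is \(\texttt{h}\)-diverging yet \(\texttt{w}\)-normalizing.

```python
def subst(t, x, u):
    """Naive substitution t{x<-u}; capture-safe here: only closed terms are substituted."""
    if t[0] == 'var':
        return u if t[1] == x else t
    if t[0] == 'app':
        return ('app', subst(t[1], x, u), subst(t[2], x, u))
    if t[1] == x:  # binder shadows x
        return t
    return ('lam', t[1], subst(t[2], x, u))

def head_step(t):
    """One step of CbN head reduction (it does go under abstractions)."""
    if t[0] == 'lam':
        s = head_step(t[2])
        return None if s is None else ('lam', t[1], s)
    if t[0] == 'app':
        if t[1][0] == 'lam':  # head redex (lam x.b) a
            return subst(t[1][2], t[1][1], t[2])
        s = head_step(t[1])
        return None if s is None else ('app', s, t[2])
    return None  # a variable is head normal

def w_step(t):
    """One step of weak CbV evaluation (it stops on abstractions)."""
    if t[0] != 'app':
        return None
    f, a = t[1], t[2]
    s = w_step(f)
    if s is not None:
        return ('app', s, a)
    s = w_step(a)
    if s is not None:
        return ('app', f, s)
    if f[0] == 'lam' and a[0] in ('var', 'lam'):  # beta fires on value arguments only
        return subst(f[2], f[1], a)
    return None

def normalizing(step, t, fuel=200):
    """Fuel-bounded approximation of normalization (divergence is undecidable)."""
    for _ in range(fuel):
        s = step(t)
        if s is None:
            return True
        t = s
    return False

# A A  ->  lam y. A A  ->  ... : an infinite sequence of abstractions.
A = ('lam', 'x', ('lam', 'y', ('app', ('var', 'x'), ('var', 'x'))))
t = ('app', A, A)
```

Here `normalizing(head_step, t)` is `False` while `normalizing(w_step, t)` is `True`, and applying `t` to further arguments keeps it \(\texttt{w}\)-normalizing, matching the "hereditarily so" remark above.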

Contribution 6: Alternative Proofs via Applicative Bisimilarity. Lastly, we show a different route to proving light genericity and co-genericity—in CbV, but the technique is general—by exploiting the link with contextual pre-orders. Namely, we give a second proof that weak CbV evaluation is openable in the VSC without using light genericity, and then we use the soundness of CbV applicative bisimilarity with respect to the (closed) contextual pre-order to give very simple proofs of light genericity and co-genericity.

Related Work. There are many proofs of CbN genericity in the literature (but they do not all prove the same statement): a topological one by Barendregt [14, Prop. 14.3.24], via intersection types by Ghilezan [24], rewriting-based ones by Takahashi [44], Kuper [30], Kennaway et al. [28], and Endrullis and de Vrijer [21], and via Taylor expansion by Barbarossa and Manzonetto [11]. Salibra studies a generalization to an infinitary \(\lambda \)-calculus [43]. García-Pérez and Nogueira prove partial genericity for Plotkin’s CbV \(\lambda \)-calculus [23] using a different notion of meaningless terms, not as well-behaved as CbV inscrutable terms.

The most famous application of genericity is the collapsibility of meaningless terms. Another application is Folkerts’s invertibility of terms for \(\lambda \eta \) [22].

Independently, Arrial, Guerrieri, and Kesner developed an alternative study of genericity in both CbN and CbV [10].

Proofs. Most proofs are omitted and can be found in the technical report [8].

2 Preliminaries

In this paper, we consider two languages, the \(\lambda \)-calculus and the value substitution calculus. Here we give abstract definitions that apply to both. We then refer to a generic language \(\mathcal {L}\) with host reduction \(\rightarrow _{\mathcal {L}}\subseteq \mathcal {L}\times \mathcal {L}\), together with an evaluation strategy discussed below. Terms of both languages are considered modulo \(\alpha \)-renaming. Capture-avoiding substitution is written \(t\{x{\leftarrow }u\}\).

Evaluation Strategies. An evaluation strategy for us is a relation \(\rightarrow _\texttt{s}\subseteq \rightarrow _{\mathcal {L}}\) which is either deterministic or has the diamond property, which, according to Dal Lago and Martini [18], is defined as follows: a relation \(\rightarrow _{\textsf{r}}\) is diamond if \(u_1 \,{}_\textsf{r}\!\!\; _{}{\leftarrow }\ t\rightarrow _{\textsf{r}} u_2\) and \(u_1 \ne u_2\) imply \(u_1 \rightarrow _{\textsf{r}} s\, {}_\textsf{r}\!\!\; _{}{\leftarrow }\ u_2\) for some \(s\). If \(\rightarrow _{\textsf{r}}\) is diamond then it is confluent, all its reductions to normal form (if any) have the same length, and if there is one such reduction from \(t\) then there are no diverging reductions from \(t\); essentially, the diamond property is a weakened notion of determinism.

We refer to a generic evaluation strategy as \(\rightarrow _\texttt{s}\), or simply \(\texttt{s}\), calling it a strategy for short and usually leaving the underlying language implicit. The conversion relation \(=_\texttt{s}\) associated to a strategy \(\texttt{s}\) is the smallest equivalence relation containing \(\rightarrow _\texttt{s}\). We say that \(t\) is:

  • \(\texttt{s}\)-normal: if \(t\not \rightarrow _\texttt{s}\);

  • \(\texttt{s}\)-normalizing: if there exists \(u\) such that \(t\rightarrow _\texttt{s}^* u\) and \(u\) is \(\texttt{s}\)-normal;

  • \(\texttt{s}\)-diverging: if \(t\) is not \(\texttt{s}\)-normalizing.

We say that \(\texttt{s}\) is:

  • Consistent: if there exist two closed terms \(t\) and \(u\) such that \(t\) is \(\texttt{s}\)-normalizing and \(u\) is \(\texttt{s}\)-diverging;

  • Normalizing: if \(t\rightarrow _{\mathcal {L}}^*u\) with \(u\) \(\texttt{s}\)-normal implies that \(t\) is \(\texttt{s}\)-normalizing;

  • Stabilizing: if \(t\) \(\texttt{s}\)-normal and \(t\rightarrow _{\mathcal {L}}^* u\) imply \(u\) \(\texttt{s}\)-normal;

  • Weak: if there are no \(\texttt{s}\)-redexes under abstraction.

Contexts. An essential tool in our study shall be contexts, which are terms where a sub-term has been replaced by a hole \(\langle \cdot \rangle \). For instance, for the \(\lambda \)-calculus they are defined as follows: \(C,C' {:}{:}{=} \langle \cdot \rangle \mid tC\mid Ct\mid \lambda x.C\). The basic operation on contexts is the plugging \(C\langle t\rangle \) of a term \(t\) in \(C\), which simply replaces \(\langle \cdot \rangle \) with \(t\) in \(C\), possibly capturing variables. For instance, \((\lambda x.\langle \cdot \rangle )\langle xy\rangle =\lambda x.xy\). Note that plugging cannot be expressed as capture-avoiding substitution since \((\lambda x.z)\{z{\leftarrow }xy\} = \lambda x'.xy\ne \lambda x.xy\).
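To make the difference concrete, here is a minimal Python sketch (our own illustration, not part of the paper's development) of a lambda-term AST, with hole plugging, which may capture variables, next to capture-avoiding substitution, which renames binders.

```python
from dataclasses import dataclass

# Minimal lambda-term AST; a context is a term containing one Hole.
@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class App:
    fun: object
    arg: object

@dataclass(frozen=True)
class Lam:
    var: str
    body: object

@dataclass(frozen=True)
class Hole:
    pass

def plug(ctx, t):
    """Plugging C<t>: textually replace the hole; capture of variables is allowed."""
    if isinstance(ctx, Hole):
        return t
    if isinstance(ctx, App):
        return App(plug(ctx.fun, t), plug(ctx.arg, t))
    if isinstance(ctx, Lam):
        return Lam(ctx.var, plug(ctx.body, t))  # no renaming of the binder
    return ctx  # a variable contains no hole

def free_vars(t):
    if isinstance(t, Var):
        return {t.name}
    if isinstance(t, App):
        return free_vars(t.fun) | free_vars(t.arg)
    if isinstance(t, Lam):
        return free_vars(t.body) - {t.var}
    return set()

def subst(t, x, u):
    """Capture-avoiding substitution t{x<-u}: renames binders when needed."""
    if isinstance(t, Var):
        return u if t.name == x else t
    if isinstance(t, App):
        return App(subst(t.fun, x, u), subst(t.arg, x, u))
    if t.var == x:  # binder shadows x
        return t
    if t.var in free_vars(u):  # rename the binder to avoid capture
        y = t.var
        while y in free_vars(u) | free_vars(t.body):
            y += "'"
        return Lam(y, subst(subst(t.body, t.var, Var(y)), x, u))
    return Lam(t.var, subst(t.body, x, u))

xy = App(Var("x"), Var("y"))
captured = plug(Lam("x", Hole()), xy)         # (lam x.<.>)<x y> = lam x. x y
renamed = subst(Lam("x", Var("z")), "z", xy)  # (lam x.z){z<-x y} = lam x'. x y
```

The two final lines reproduce the examples in the text: plugging captures the free `x` of `xy`, whereas substitution renames the binder to `x'`.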

Contextual Equivalences and Pre-Orders. The standard of reference for program equivalences is contextual equivalence. The following definition slightly generalizes the standard one so as to also capture the open case studied in this paper.

Definition 1

(Open and Closed Contextual Pre-Order and Equivalence). Given an evaluation strategy \(\texttt{s}\), we define the open contextual pre-order \(\precsim _{\mathcal {C}\mathcal {O}}^{\texttt{s}}\) and open contextual equivalence \(\simeq _{\mathcal {C}\mathcal {O}}^{\texttt{s}}\) as follows:

  • \(t\precsim _{\mathcal {C}\mathcal {O}}^{\texttt{s}} t'\) if, for all contexts \(C\), \(C\langle t\rangle \) is \(\texttt{s}\)-normalizing implies that \(C\langle t'\rangle \) is \(\texttt{s}\)-normalizing;

  • \(t\simeq _{\mathcal {C}\mathcal {O}}^{\texttt{s}} t'\) is the equivalence relation induced by \(\precsim _{\mathcal {C}\mathcal {O}}^{\texttt{s}}\), that is, \(t\simeq _{\mathcal {C}\mathcal {O}}^{\texttt{s}} t'\) if \(t\precsim _{\mathcal {C}\mathcal {O}}^{\texttt{s}} t'\) and \(t'\precsim _{\mathcal {C}\mathcal {O}}^{\texttt{s}} t\).

The closed variants, simply called contextual pre-order \(\precsim _\mathcal {C}^{\texttt{s}}\) and contextual equivalence \(\simeq _\mathcal {C}^{\texttt{s}}\), are defined as above but restricting to contexts \(C\) such that \(C\langle t\rangle \) and \(C\langle t'\rangle \) are closed terms. We say that \(\texttt{s}\) is openable if \(\precsim _{\mathcal {C}\mathcal {O}}^{\texttt{s}}\) and \(\precsim _\mathcal {C}^{\texttt{s}}\) coincide.

It follows from the definitions that \(\precsim _{\mathcal {C}\mathcal {O}}^{\texttt{s}} \subseteq \precsim _\mathcal {C}^{\texttt{s}}\), and similarly for the equivalences, while the other direction is not obvious, and can indeed fail. For instance, if \(\mathtt {p_w}\) is weak evaluation in Plotkin’s CbV \(\lambda \)-calculus (to be defined in Sect. 5) and \(\delta {:}{=}\lambda z.zz\) then we have \(\varOmega _l{:}{=}(\lambda x.\delta ) (yy)\delta \precsim _\mathcal {C}^{\mathtt {p_w}} \delta \delta =: \varOmega \) but \(\varOmega _l\not \precsim _{\mathcal {C}\mathcal {O}}^{\mathtt {p_w}} \varOmega \). That is, \(\mathtt {p_w}\) is not openable. To our knowledge, the notion of openable strategy is new.
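The operational half of this example can be checked with a small Python sketch (ours, not the paper's; terms are encoded as tuples, and substitution is naive, which is safe here because only closed terms are substituted): weak CbV evaluation gets stuck on \(\varOmega _l\), whose argument \(yy\) is open and not a value, while \(\varOmega \) loops.

```python
def is_value(t):
    """Plotkin's values: variables and abstractions."""
    return t[0] in ('var', 'lam')

def subst(t, x, u):
    """Naive substitution t{x<-u}; capture-safe here since u is always closed."""
    if t[0] == 'var':
        return u if t[1] == x else t
    if t[0] == 'app':
        return ('app', subst(t[1], x, u), subst(t[2], x, u))
    if t[1] == x:  # binder shadows x
        return t
    return ('lam', t[1], subst(t[2], x, u))

def pw_step(t):
    """One step of weak CbV evaluation: no reduction under lambda,
    and beta fires only when the argument is a value."""
    if t[0] != 'app':
        return None
    f, a = t[1], t[2]
    s = pw_step(f)
    if s is not None:
        return ('app', s, a)
    s = pw_step(a)
    if s is not None:
        return ('app', f, s)
    if f[0] == 'lam' and is_value(a):
        return subst(f[2], f[1], a)
    return None  # stuck: e.g. the argument is open and not a value

def pw_normalizing(t, fuel=200):
    """Fuel-bounded approximation (divergence is undecidable in general)."""
    for _ in range(fuel):
        s = pw_step(t)
        if s is None:
            return True
        t = s
    return False

delta = ('lam', 'z', ('app', ('var', 'z'), ('var', 'z')))
omega = ('app', delta, delta)                               # pw-diverging
yy = ('app', ('var', 'y'), ('var', 'y'))
omega_l = ('app', ('app', ('lam', 'x', delta), yy), delta)  # pw-normal: stuck on yy
```

This only illustrates the easy, operational part of the claim: the statements \(\varOmega _l\precsim _\mathcal {C}^{\mathtt {p_w}}\varOmega \) and \(\varOmega _l\not \precsim _{\mathcal {C}\mathcal {O}}^{\mathtt {p_w}}\varOmega \) quantify over all (closing, resp. arbitrary) contexts.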

(In)Equational Theories. A relation \(\mathcal {R}\) is compatible if \(t~{\mathcal {R}}~u\) implies \(C\langle t\rangle ~{\mathcal {R}}~C\langle u\rangle \) for any context \(C\) and any terms \(t\) and \(u\). A term \(t\) is minimum for a pre-order \(\le \) if for all \(u\in \mathcal {L}\), \(t\le u\). We denote abstract inequational theories with the symbol \(\le _\mathcal {T}\) to distinguish them from known program pre-orders, denoted with \(\precsim _{\mathcal {P}}\).

Definition 2

(Inequational \(\texttt{s}\) -theory). Let \(\texttt{s}\) be an evaluation strategy. An inequational \(\texttt{s}\)-theory \(\le _\mathcal {T}^{\texttt{s}}\) is a compatible pre-order on terms containing \(\texttt{s}\)-conversion. An inequational \(\texttt{s}\)-theory \(\le _\mathcal {T}^{\texttt{s}}\) is called:

  • Consistent: whenever it does not relate all terms;

  • \(\texttt{s}\)-ground: if \(\texttt{s}\)-diverging terms are minimum terms for \(\le _\mathcal {T}^{\texttt{s}}\);

  • \(\texttt{s}\)-adequate: if \(t\le _\mathcal {T}^{\texttt{s}}u\) and \(t\) is \(\texttt{s}\)-normalizing, then \(u\) is \(\texttt{s}\)-normalizing.

The notions of \(\texttt{s}\)-ground and \(\texttt{s}\)-adequate theories generalize to an abstract and inequational framework the \(\lambda \)-calculus notions of sensible and semi-sensible theories (whose non-abstract inequational versions are studied in particular in the recent book by Barendregt and Manzonetto [16]), up to a very minor difference: the definitions in the literature sometimes also ask for consistency, which we treat independently. An equational theory is a symmetric inequational theory.

Remark 1

Any open contextual pre-order \(\precsim _{\mathcal {C}\mathcal {O}}^{\texttt{s}}\) is \(\texttt{s}\)-adequate: if \(t\precsim _{\mathcal {C}\mathcal {O}}^{\texttt{s}}u\) then, by considering the empty context, \(t\) \(\texttt{s}\)-normalizing implies \(u\) \(\texttt{s}\)-normalizing. Closed contextual pre-orders, instead, are not necessarily adequate: for weak evaluation \(\mathtt {p_w}\) in Plotkin’s calculus, \(\varOmega _l \precsim _\mathcal {C}^{\mathtt {p_w}} \varOmega \), \(\varOmega _l\) is \(\mathtt {p_w}\)-normal, and \(\varOmega \) is \(\mathtt {p_w}\)-diverging.

Lastly, we show under which conditions the contextual pre-orders \(\precsim _{\mathcal {C}\mathcal {O}}^{\texttt{s}}\) and \(\precsim _\mathcal {C}^{\texttt{s}}\) are consistent inequational \(\texttt{s}\)-theories.

Proposition 1

Let \(\mathcal {L}\) be a confluent language and \(\texttt{s}\) be a normalizing and stabilizing strategy. Then \(\precsim _{\mathcal {C}\mathcal {O}}^{\texttt{s}}\) and \(\precsim _\mathcal {C}^{\texttt{s}}\) (resp. \(\simeq _{\mathcal {C}\mathcal {O}}^{\texttt{s}}\) and \(\simeq _\mathcal {C}^{\texttt{s}}\)) are inequational (resp. equational) \(\texttt{s}\)-theories. Moreover, if \(\texttt{s}\) is consistent then \(\precsim _{\mathcal {C}\mathcal {O}}^{\texttt{s}}\), \(\precsim _\mathcal {C}^{\texttt{s}}\), \(\simeq _{\mathcal {C}\mathcal {O}}^{\texttt{s}}\), and \(\simeq _\mathcal {C}^{\texttt{s}}\) are consistent.

3 Light Genericity and Collapsibility

As our working notion of genericity, we adopt the following abstract light version.

Definition 3

(Light genericity). Let \(\texttt{s}\) be an evaluation strategy. Light \(\texttt{s}\)-genericity is the following property: if \(u\) is \(\texttt{s}\)-diverging and \(C\) is a context such that \(C\langle u\rangle \) is \(\texttt{s}\)-normalizing, then \(C\langle t\rangle \) is \(\texttt{s}\)-normalizing for all \(t\). Concisely: \(\texttt{s}\)-diverging terms are minimums for \(\precsim _{\mathcal {C}\mathcal {O}}^{\texttt{s}}\). Very concisely: \(\precsim _{\mathcal {C}\mathcal {O}}^{\texttt{s}}\) is \(\texttt{s}\)-ground.

We now show that light genericity is enough to obtain the main application of Barendregt’s heavier notion, namely that \(\texttt{s}\)-diverging terms can be consistently equated (when \(\texttt{s}\) is consistent, which is a very mild hypothesis verified by all strategies of interest), by showing that they are contextually equivalent, in both the closed and open variants, and independently of whether the strategy is openable.

Proposition 2

(Collapsibility). Let \(\texttt{s}\) be a consistent evaluation strategy satisfying light genericity. Then:

  1.

    Open: \(\simeq _{\mathcal {C}\mathcal {O}}^{\texttt{s}}\) equates all \(\texttt{s}\)-diverging terms and it is consistent;

  2.

    Closed: \(\simeq _\mathcal {C}^{\texttt{s}}\) equates all \(\texttt{s}\)-diverging terms and it is consistent.

Proof

  1.

    By light genericity, \(\texttt{s}\)-diverging terms are minimums for \(\precsim _{\mathcal {C}\mathcal {O}}^{\texttt{s}}\). Then any two \(\texttt{s}\)-diverging terms are \(\precsim _{\mathcal {C}\mathcal {O}}^{\texttt{s}}\)-smaller than each other, hence \(\texttt{s}\)-diverging terms are \(\simeq _{\mathcal {C}\mathcal {O}}^{\texttt{s}}\)-equivalent. Since \(\texttt{s}\) is consistent, \(\simeq _{\mathcal {C}\mathcal {O}}^{\texttt{s}}\) is consistent by Prop. 1.

  2.

    Since \(\precsim _{\mathcal {C}\mathcal {O}}^{\texttt{s}}\subseteq \precsim _\mathcal {C}^{\texttt{s}}\), light genericity implies that \(\texttt{s}\)-diverging terms are minimums for \(\precsim _\mathcal {C}^{\texttt{s}}\), and so \(\simeq _\mathcal {C}^{\texttt{s}}\) equates all \(\texttt{s}\)-diverging terms. Since \(\texttt{s}\) is consistent, \(\simeq _\mathcal {C}^{\texttt{s}}\) is consistent by Prop. 1.   \(\square \)

Proposition 3

(Characterization of minimum terms for \(\precsim _{\mathcal {C}\mathcal {O}}^{\texttt{s}}\)). Let \(\texttt{s}\) be a consistent evaluation strategy satisfying light genericity. Then the minimum terms for \(\precsim _{\mathcal {C}\mathcal {O}}^{\texttt{s}}\) are exactly the \(\texttt{s}\)-diverging terms.

Proof

By light genericity, \(\texttt{s}\)-diverging terms are minimums for \(\precsim _{\mathcal {C}\mathcal {O}}^{\texttt{s}}\). Conversely, by consistency of \(\texttt{s}\) there exists a \(\texttt{s}\)-diverging term \(t\). Let \(u\) be a minimum for \(\precsim _{\mathcal {C}\mathcal {O}}^{\texttt{s}}\). Then \(u\precsim _{\mathcal {C}\mathcal {O}}^{\texttt{s}} t\), hence \(u\) is \(\texttt{s}\)-diverging by \(\texttt{s}\)-adequacy of \(\precsim _{\mathcal {C}\mathcal {O}}^{\texttt{s}}\) (given by Remark 1).   \(\square \)

The characterization of minimum terms does not hold in the closed case, because the closed contextual pre-order is not necessarily adequate (Remark 1). For weak evaluation \(\mathtt {p_w}\) in Plotkin’s calculus, indeed, \(\varOmega _l\) is a minimum term for \(\precsim _\mathcal {C}^{\mathtt {p_w}}\) and it is \(\mathtt {p_w}\)-normal. The characterization does lift to the closed case when \(\texttt{s}\) is openable.

Fig. 1. Call-by-Name calculus.

4 The Head Call-by-Name Case

Here we revisit two results from the theory of the \(\lambda \)-calculus, and use them to prove that head evaluation is openable. The first result is genericity for unsolvable terms—that is, head-diverging terms—for which we give a proof of light genericity. The second result is the maximality of the open head contextual pre-order.

The host language \(\mathcal {L}\) here is the \(\lambda \)-calculus and the evaluation strategy \(\texttt{s}\) is the head strategy \(\texttt{h}\). Both are defined in Fig. 1.

Solvability and Head Reduction. In the literature, the original notion of meaningful term is that of solvable terms, characterized by Wadsworth as those terminating for head reduction [46]; meaningless terms are their complement.

Definition 4

((Un)Solvable terms). A term \(t\) is solvable if there is a head context \(H\) such that \(H\langle t\rangle \rightarrow _{\beta }^* \texttt{I}=\lambda x.x\), and unsolvable otherwise.

Theorem 1

(Operational characterization of solvability, [46]). \(t\) is solvable (resp. unsolvable) if and only if \(t\) is \(\texttt{h}\)-normalizing (resp. \(\texttt{h}\)-diverging).

Apart from the proof of Theorem 4 below, we shall always use the operational characterization and never refer to solvability itself.
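For intuition, head reduction and a fuel-bounded check of \(\texttt{h}\)-normalization can be sketched in Python (our own illustration; terms are encoded as tuples, and substitution is naive, which is safe for the closed examples used):

```python
def subst(t, x, u):
    """Naive substitution t{x<-u}; capture-safe here since u is always closed."""
    if t[0] == 'var':
        return u if t[1] == x else t
    if t[0] == 'app':
        return ('app', subst(t[1], x, u), subst(t[2], x, u))
    if t[1] == x:  # binder shadows x
        return t
    return ('lam', t[1], subst(t[2], x, u))

def head_step(t):
    """One step of head reduction ->_h; returns None on head normal forms."""
    if t[0] == 'lam':  # head reduction does go under abstractions
        s = head_step(t[2])
        return None if s is None else ('lam', t[1], s)
    if t[0] == 'app':
        if t[1][0] == 'lam':  # head redex (lam x.b) a
            return subst(t[1][2], t[1][1], t[2])
        s = head_step(t[1])
        return None if s is None else ('app', s, t[2])
    return None  # a variable is head normal

def h_normalizing(t, fuel=200):
    """Fuel-bounded approximation of ->_h-normalization (divergence is undecidable)."""
    for _ in range(fuel):
        s = head_step(t)
        if s is None:
            return True
        t = s
    return False

I = ('lam', 'x', ('var', 'x'))
delta = ('lam', 'x', ('app', ('var', 'x'), ('var', 'x')))
omega = ('app', delta, delta)                         # unsolvable: h-diverging
C_omega = ('app', ('lam', 'z', ('var', 'y')), omega)  # C<Omega> with C = (lam z.y)<.>
C_I = ('app', ('lam', 'z', ('var', 'y')), I)          # C<I>: same context, Omega replaced
```

Here `h_normalizing(omega)` is `False`, while both `C_omega` and `C_I` head-normalize (to `y`): evaluation never enters the black hole, in accordance with light genericity.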

Head Contextual Pre-Orders are Inequational. It is well-known that the \(\lambda \)-calculus is confluent, that head normal forms are stable under reduction (that is, \(\texttt{h}\) is stabilizing), and that the following normalization theorem holds (for a recent simple proof of this classic result see Accattoli et al. [3]). These facts and Prop. 1 give that the head contextual pre-orders are inequational \(\texttt{h}\)-theories.

Theorem 2

(Head normalization). If \(t\rightarrow _{\beta }^*t'\) and \(t'\) is \(\texttt{h}\)-normal then \(t\) is \(\texttt{h}\)-normalizing.

Proposition 4

The head pre-orders \(\precsim _{\mathcal {C}\mathcal {O}}^{\texttt{h}}\) and \(\precsim _\mathcal {C}^{\texttt{h}}\) are inequational \(\texttt{h}\)-theories.

Proofs of Genericity. In his book [14], Barendregt gives two proofs that \(\texttt{h}\)-diverging terms can be consistently equated, both using heavy genericity (defined in the introduction). A first one [14, Lemma 16.1.8 and Thm. 16.1.9] uses it to show that the minimal equational theory equating them, denoted \(\mathcal {H}\), is consistent. This proof is where the heavy part of genericity is used. A second proof [14, Lemma 16.2.3] exploits the consistency of \(\simeq _{\mathcal {C}\mathcal {O}}^{\texttt{h}}\) (denoted \(\mathcal {H}^*\) in [14]), which is trivial, and uses genericity to show that \(\mathcal {H}\subseteq \simeq _{\mathcal {C}\mathcal {O}}^{\texttt{h}}\), i.e. that \(\simeq _{\mathcal {C}\mathcal {O}}^{\texttt{h}}\) equates all \(\texttt{h}\)-diverging terms.

The second proof in [14] uses heavy genericity, but the heavy aspect is in fact not needed for the proof to go through. The abstract result of the previous section, indeed, follows essentially the same reasoning and uses only light genericity.

We now prove light genericity for head reduction, via a direct proof, using the rewriting properties of head reduction.

Head Light Genericity via Takahashi’s Technique. Our proof of light genericity adapts Takahashi’s simple technique for heavy genericity [44]. We stress that two standard and crucial properties of head reduction are at work in Takahashi’s proof, despite the fact that she does not point them out, namely the head normalization theorem (Theorem 2) and the following property.

Proposition 5

(Head substitutivity). If \(t\rightarrow _{\texttt{h}}^* u\) then \(t\{x{\leftarrow }s\}\rightarrow _{\texttt{h}}^* u\{x{\leftarrow }s\}\), for all \(t,u,s\).

Firstly, we prove genericity for \(\texttt{h}\)-normal forms, via a simple induction on the structure of normal forms, using an auxiliary lemma [8, Lemma 4].

Proposition 6

(Normal genericity). Let \(u\) be \(\texttt{h}\)-diverging and \(s\) be any term.

  1.

    If \(r\) is a rigid term and \(r\{x{\leftarrow }u\}\) is \(\texttt{h}\)-normalizing then \(r\{x{\leftarrow }s\}\) is a rigid term.

  2.

    If \(h\) is \(\texttt{h}\)-normal and \(h\{x{\leftarrow }u\}\) is \(\texttt{h}\)-normalizing then \(h\{x{\leftarrow }s\}\) is \(\texttt{h}\)-normal.

We can now prove (light) genericity, which is done in two steps. The first one simply lifts \(\texttt{h}\)-normal genericity to non-\(\texttt{h}\)-normal terms, obtaining a substitution-based version of genericity. The second one turns the substitution-based statement into a context-based statement, and its proof is what we shall refer to as Takahashi’s trick. For the sake of clarity, note that the two statements are not immediately equivalent, since substitution is a capture-avoiding operation while context plugging may capture free variables.

Theorem 3

(Light genericity). Let \(u\) be \(\texttt{h}\)-diverging and \(s\) be any term.

  1.

    Light genericity as substitution: if \(t\) is a term and \(t\{x{\leftarrow }u\}\) is \(\texttt{h}\)-normalizing then \(t\{x{\leftarrow }s\}\) is \(\texttt{h}\)-normalizing.

  2.

    Light genericity as context: if \(C\) is a context and \(C\langle u\rangle \) is \(\texttt{h}\)-normalizing then \(C\langle s\rangle \) is \(\texttt{h}\)-normalizing.

Proof

  1.

    It follows from Prop. 5 (precisely, via a lemma in [8, Lemma 4]) that if \(t\{x{\leftarrow }u\}\) is \(\texttt{h}\)-normalizing then so is \(t\). Then \(t\rightarrow _{\texttt{h}}^{*} h\) for some \(\texttt{h}\)-normal \(h\). Again by substitutivity of head reduction (Prop. 5), we have both \(t\{x{\leftarrow }u\}\rightarrow _{\texttt{h}}^{*} h\{x{\leftarrow }u\}\) and \(t\{x{\leftarrow }s\}\rightarrow _{\texttt{h}}^{*} h\{x{\leftarrow }s\}\). Note that \(t\{x{\leftarrow }u\}\) \(\texttt{h}\)-normalizing implies \(h\{x{\leftarrow }u\}\) \(\texttt{h}\)-normalizing. By normal genericity (Prop. 6), \(h\{x{\leftarrow }s\}\) is \(\texttt{h}\)-normal. Therefore, \(t\{x{\leftarrow }s\}\) is \(\texttt{h}\)-normalizing.

  2.

    Let \(\texttt{fv}(u)\cup \texttt{fv}(s)= \{x_{1},\ldots ,x_{k}\}\), and \(y\) be a variable fresh with respect to \(\texttt{fv}(u)\cup \texttt{fv}(s)\cup \texttt{fv}(C)\) and not captured by \(C\). Note that \(\bar{u}{:}{=}\lambda x_{1}.\ldots \lambda x_{k}.u\) is a closed term. Consider \(t{:}{=}C\langle yx_{1} \ldots x_{k}\rangle \), and note that:

    $$\begin{aligned} \begin{array}{llllll} t\{y{\leftarrow }\bar{u}\} = C\langle \bar{u}x_{1} \ldots x_{k}\rangle = C\langle (\lambda x_{1}.\ldots \lambda x_{k}.u) x_{1} \ldots x_{k}\rangle \rightarrow _{\beta }^{k} C\langle u\rangle . \end{array} \end{aligned}$$

    The fact that \(u\) is \(\texttt{h}\)-diverging implies that \(\bar{u}\) is also \(\texttt{h}\)-diverging. If \(C\langle u\rangle \) is \(\texttt{h}\)-normalizing then so is \(t\{y{\leftarrow }\bar{u}\}\) by the \(\texttt{h}\)-normalization theorem (Theorem 2). By genericity as substitution, \(t\{y{\leftarrow }s'\}\) is \(\texttt{h}\)-normalizing for every \(s'\). In particular, take \(s' {:}{=}\bar{s}= \lambda x_{1}.\ldots \lambda x_{k}.s\), then \(t\{y{\leftarrow }\bar{s}\}\) \(\texttt{h}\)-normalizes to some \(h\) and note that \(t\{y{\leftarrow }\bar{s}\} \rightarrow _{\beta }^{*}C\langle s\rangle \). Since \(\beta \) is confluent and \(\texttt{h}\) is stabilizing, there exists a \(\texttt{h}\)-normal form \(h'\) such that \(h\rightarrow _{\beta }^*h'\) and \(C\langle s\rangle \rightarrow _{\beta }^*h'\). By the \(\texttt{h}\)-normalization theorem (Theorem 2), \(C\langle s\rangle \) is \(\texttt{h}\)-normalizing.   \(\square \)
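To make Takahashi's trick concrete, the following Python sketch (our own illustration, not part of the paper's development) instantiates the construction with \(k=1\), the empty context \(C=\langle \cdot \rangle \), and \(u{:}{=}x_1x_1\): it checks that \(t\{y{\leftarrow }\bar{u}\}\) \(\beta \)-reduces in one step to \(C\langle u\rangle \). Terms are encoded as nested tuples; the substitution is naive, which is safe here since no capture can occur.

```python
# Terms: ('var', x) | ('lam', x, body) | ('app', t, u).

def subst(t, x, s):
    # Naive substitution t{x<-s}; adequate here, since no capture can occur.
    tag = t[0]
    if tag == 'var':
        return s if t[1] == x else t
    if tag == 'lam':
        return t if t[1] == x else ('lam', t[1], subst(t[2], x, s))
    return ('app', subst(t[1], x, s), subst(t[2], x, s))

def beta_step(t):
    # Fire a top-level beta-redex (lam x.b) a -> b{x<-a}, if any.
    if t[0] == 'app' and t[1][0] == 'lam':
        return subst(t[1][2], t[1][1], t[2])
    return None

u = ('app', ('var', 'x1'), ('var', 'x1'))   # u := x1 x1, with fv(u) = {x1}
u_bar = ('lam', 'x1', u)                    # u-bar := lam x1.u, a closed term
t = ('app', ('var', 'y'), ('var', 'x1'))    # t := C<y x1> with C = <.>

t_sub = subst(t, 'y', u_bar)                # t{y<-u_bar} = (lam x1.u) x1
assert t_sub == ('app', u_bar, ('var', 'x1'))
assert beta_step(t_sub) == u                # one beta-step reaches C<u> = u
```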

Maximality of \(\precsim _{\mathcal {C}\mathcal {O}}^{\texttt{h}}\). Barendregt shows that \(\simeq _{\mathcal {C}\mathcal {O}}^{\texttt{h}}\) is a maximal consistent theory, that is, that equating more terms would yield an inconsistent theory [14, Thm 16.2.6]. Later on, Barendregt and Manzonetto refine the result for \(\precsim _{\mathcal {C}\mathcal {O}}^{\texttt{h}}\) [16], using the same technique, which relies on Böhm's theorem. We present here a new proof of maximality based only on light genericity and not needing Böhm's theorem, which is a heavier property, thus obtaining an arguably simpler proof. It is inspired by the proof of maximality for CbV by Egidi et al. [19].

Theorem 4

1. Let \(\mathcal {T}\) be an inequational \(\texttt{h}\)-theory that is \(\texttt{h}\)-ground but not \(\texttt{h}\)-adequate. Then \(\mathcal {T}\) is inconsistent.

2. Maximality of \(\precsim _{\mathcal {C}\mathcal {O}}^{\texttt{h}}\): \(\precsim _{\mathcal {C}\mathcal {O}}^{\texttt{h}}\) is a maximal consistent inequational \(\texttt{h}\)-theory.

Proof

1. Since \(\mathcal {T}\) is not \(\texttt{h}\)-adequate, there are \(t\) \(\texttt{h}\)-normalizing and \(u\) \(\texttt{h}\)-diverging such that \(t\le _{\mathcal {T}} u\). Since \(t\) is \(\texttt{h}\)-normalizing, by solvability there is a head context \(H\) sending it to the identity \(\texttt{I}\). By the definition of inequational theory, we have \(\texttt{I}=_{\mathcal {T}} H\langle t\rangle \le _{\mathcal {T}} H\langle u\rangle \). Now, let \(s\) be a term. Then \(s=_{\mathcal {T}} \texttt{I}s\) because \(=_{\beta }\subseteq \mathcal {T}\) by definition of inequational theory. By the context closure of theories and \(\texttt{I}\le _{\mathcal {T}} H\langle u\rangle \), we obtain \(\texttt{I}s\le _{\mathcal {T}} H\langle u\rangle s\). Since \(u\) is \(\texttt{h}\)-diverging, thus unsolvable, \(H\langle u\rangle \) is \(\texttt{h}\)-diverging. Since \(\mathcal {T}\) is \(\texttt{h}\)-ground and both \(H\langle u\rangle \) and \(H\langle u\rangle s\) are \(\texttt{h}\)-diverging, \(H\langle u\rangle s=_{\mathcal {T}} H\langle u\rangle \). Summing up, \(s=_{\mathcal {T}} \texttt{I}s\le _{\mathcal {T}} H\langle u\rangle s=_{\mathcal {T}} H\langle u\rangle \) and, by the fact that \(\mathcal {T}\) is \(\texttt{h}\)-ground, \(H\langle u\rangle \le _{\mathcal {T}}s\). Hence, \(s=_{\mathcal {T}} H\langle u\rangle \) for every term \(s\), that is, \(\mathcal {T}\) is inconsistent.

2. Any theory \(\mathcal {T}\) strictly extending \(\precsim _{\mathcal {C}\mathcal {O}}^{\texttt{h}}\) contains a pair \(t\le _{\mathcal {T}} u\) with \(t\not \precsim _{\mathcal {C}\mathcal {O}}^{\texttt{h}}u\), i.e. such that \(C\langle t\rangle \) is \(\texttt{h}\)-normalizing and \(C\langle u\rangle \) is \(\texttt{h}\)-diverging for some \(C\). By compatibility of \(\mathcal {T}\), \(C\langle t\rangle \le _\mathcal {T}C\langle u\rangle \). Hence \(\mathcal {T}\) is not \(\texttt{h}\)-adequate. Since \(\precsim _{\mathcal {C}\mathcal {O}}^{\texttt{h}}\) is \(\texttt{h}\)-ground by head light genericity (Theorem 3), every theory \(\mathcal {T}\) extending \(\precsim _{\mathcal {C}\mathcal {O}}^{\texttt{h}}\) is also \(\texttt{h}\)-ground. Then \(\mathcal {T}\) is \(\texttt{h}\)-ground and not \(\texttt{h}\)-adequate. By Point 1, \(\mathcal {T}\) is inconsistent.    \(\square \)

Maximality and Head is Openable. From maximality of \(\precsim _{\mathcal {C}\mathcal {O}}^{\texttt{h}}\) it elegantly follows that \(\precsim _{\mathcal {C}\mathcal {O}}^{\texttt{h}}\) and \(\precsim _\mathcal {C}^{\texttt{h}}\) coincide. To our knowledge, there is no such result in the literature, although it is folklore for CbN. Note that, despite the apparently trivial proof that we provide below, the equivalence of \(\precsim _{\mathcal {C}\mathcal {O}}^{\texttt{h}}\) and \(\precsim _\mathcal {C}^{\texttt{h}}\) is not a trivial fact, as the crucial inclusion \(\precsim _\mathcal {C}^{\texttt{h}}\,\subseteq \,\precsim _{\mathcal {C}\mathcal {O}}^{\texttt{h}}\) cannot be proved directly from the definitions of the pre-orders—in our proof, the non-trivial aspect is encapsulated in the use of maximality. Paolini proves that closed theories can be uniquely extended to open terms [37], but this does not imply that the extension of the closed contextual pre-order coincides with the open contextual pre-order.

Proposition 7

(Head evaluation is openable). Open and closed head contextual pre-orders coincide: \(\precsim _{\mathcal {C}\mathcal {O}}^{\texttt{h}}\,=\, \precsim _\mathcal {C}^{\texttt{h}}\).

Proof

Firstly, \(\precsim _{\mathcal {C}\mathcal {O}}^{\texttt{h}}\,\subseteq \,\precsim _\mathcal {C}^{\texttt{h}}\) follows from the definitions. Secondly, by maximality of \(\precsim _{\mathcal {C}\mathcal {O}}^{\texttt{h}}\) (Theorem 4) and since \(\precsim _\mathcal {C}^{\texttt{h}}\) is consistent (because \(\texttt{I}\not \precsim _\mathcal {C}^{\texttt{h}}\varOmega \)), we have that the two pre-orders must coincide, i.e. \(\precsim _{\mathcal {C}\mathcal {O}}^{\texttt{h}}\,=\,\precsim _\mathcal {C}^{\texttt{h}}\).   \(\square \)

5 Weak Call-by-Value and the VSC

We now turn our attention to the CbV case, for which the literature has already extensively discussed two issues that arise when adapting the CbN case to Plotkin’s CbV \(\lambda \)-calculus, recalled after the definition of the calculus.

Fig. 2. Plotkin's CbV and Weak Evaluation \(\mathtt {p_w}\).

Plotkin’s CbV \(\lambda \)-Calculus. Plotkin’s CbV \(\lambda \)-calculus is defined in Fig. 2, following the modern presentation by Dal Lago and Martini [18] rather than Plotkin’s original one [40]. We also define its weak evaluation strategy \(\rightarrow _{\mathtt {p_w}}\).

Issue 1: CbV Unsolvable Terms Are Not Collapsible. As pointed out by Accattoli and Guerrieri [6], the CbV variant of unsolvable terms is not a good notion of meaningless terms, as their identification induces an inconsistent equational theory. The solution amounts to switching to a different notion of meaningless terms, the inscrutable ones (that coincide with the non-potentially valuable terms of Paolini and Ronchi della Rocca [36, 38, 41]), which are collapsible [6].

Definition 5

(Testing contexts). Testing contexts are defined by the following grammar: \(T ::= \langle \cdot \rangle \mid (\lambda x.T)t\mid Tt\).

Definition 6

((In)Scrutable terms). A term \(t\) is scrutable if there is a testing context \(T\) and a value \(v\) such that \(T\langle t\rangle \rightarrow _{{\beta _v}}^* v\), and inscrutable otherwise.

Issue 2: CbV Inscrutable Terms Have No Operational Characterization in Plotkin’s CbV. The term \(\varOmega _l{:}{=}(\lambda x.\delta ) (yy)\delta \) is inscrutable but \(\rightarrow _{{\beta _v}}\)-normal. Therefore, in Plotkin’s CbV there cannot be any operational characterization of inscrutable terms via a notion of divergence, as instead happens in CbN (Thm. 1). This fact is a real drawback, and boils down to the well-known inability of Plotkin’s calculus to deal with open terms, which is also the reason why—as we have pointed out in Sect. 2—the closed and open contextual notions induced by weak evaluation in Plotkin’s calculus do not coincide.
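The claim that \(\varOmega _l\) is \(\rightarrow _{{\beta _v}}\)-normal is easy to check mechanically. Below is a minimal sketch (our own tuple encoding of terms, not the paper's) that searches a term for a \(\beta _v\)-redex, that is, an application of an abstraction to a value; it finds none in \(\varOmega _l\), since the argument \(yy\) of its only \(\beta \)-redex is not a value.

```python
# Terms: ('var', x) | ('lam', x, body) | ('app', t, u).

def is_value(t):
    # In Plotkin's calculus, values are variables and abstractions.
    return t[0] in ('var', 'lam')

def has_betav_redex(t):
    # A beta_v-redex is an application (lam x.b) v whose argument v is a value.
    if t[0] == 'app':
        fun, arg = t[1], t[2]
        if fun[0] == 'lam' and is_value(arg):
            return True
        return has_betav_redex(fun) or has_betav_redex(arg)
    if t[0] == 'lam':
        return has_betav_redex(t[2])
    return False

delta = ('lam', 'x', ('app', ('var', 'x'), ('var', 'x')))   # delta := lam x.xx
yy = ('app', ('var', 'y'), ('var', 'y'))
omega_l = ('app', ('app', ('lam', 'x', delta), yy), delta)  # (lam x.delta)(yy) delta
omega = ('app', delta, delta)                               # Omega := delta delta

assert not has_betav_redex(omega_l)  # Omega_l is beta_v-normal
assert has_betav_redex(omega)        # while Omega has a beta_v-redex
```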

The solution amounts to switching to a refined CbV \(\lambda \)-calculus, extending Plotkin's so as to better deal with open terms while retaining the same notion of contextual equivalence, as we now explain.

Fig. 3. Weak Value Substitution Calculus.

The VSC. Accattoli and Paolini’s value substitution calculus (VSC) [9], defined in Figure 3, is exactly one such framework.

Intuitively, the VSC is a CbV \(\lambda \)-calculus extended with \(\textsf{let}\)-expressions, as is common for CbV \(\lambda \)-calculi such as Moggi's [33, 34]. We do however replace a \(\textsf{let}\)-expression \(\textsf{let}\ x=u\ \textsf{in}\ t\) with a more compact explicit substitution (ES for short) notation \(t[x{\leftarrow }u]\), which binds \(x\) in \(t\) and has precedence over abstraction and application (that is, \(\lambda x.t[y{\leftarrow }u]\) stands for \(\lambda x.(t[y{\leftarrow }u])\) and \(ts[y{\leftarrow }u]\) for \(t(s[y{\leftarrow }u])\)). Moreover, our \(\textsf{let}\)/ES does not fix an order of evaluation between \(t\) and \(u\), in contrast to many papers in the literature (e.g. Sabry and Wadler [42] or Levy et al. [32]) where \(u\) is evaluated first.

The reduction rules of the VSC are slightly unusual as they use contexts both to allow one to reduce redexes located in sub-terms, which is standard, and to define the redexes themselves, which is less standard—rules of this kind are called rules at a distance. The rationale is that the rewriting rules are designed to mimic cut-elimination on proof nets, via Girard's CbV translation \((A \Rightarrow B)^v = ! (A^v \multimap B^v)\) of intuitionistic logic into linear logic [25], see Accattoli [2].

Examples of steps: \((\lambda x.y)[y{\leftarrow }t]u\mapsto _{\textsf{m}}y[x{\leftarrow }u][y{\leftarrow }t]\) and \((\lambda z.xx)[x{\leftarrow }y[y{\leftarrow }t]] \mapsto _{\textsf{e}}(\lambda z.yy)[y{\leftarrow }t]\). One with on-the-fly \(\alpha \)-renaming is \((\lambda x.y)[y{\leftarrow }t]y\mapsto _{\textsf{m}}z[x{\leftarrow }y][z{\leftarrow }t]\).

A key point is that \(\beta \)-redexes are decomposed via ESs, indeed \(\mapsto _{{\beta _v}}\) is simulated as \((\lambda x.t)v\mapsto _{\textsf{m}}t[x{\leftarrow }v]\mapsto _{\textsf{e}}t\{x{\leftarrow }v\}\). Note that the by-value restriction is on ES-redexes, not on \(\beta \)-redexes, because only values can be substituted. The VSC is a conservative refinement for both closed and open terms: its weak evaluation on closed terms terminates if and only if Plotkin’s \(\rightarrow _{\mathtt {p_w}}\) does, hence the closed contextual pre-orders coincide (Prop. 8.3 below). On open terms, the VSC can simulate every \(\rightarrow _{\mathtt {p_w}}\) step, but not vice-versa (which is why we adopt the VSC).
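The decomposition of a \(\beta _v\)-step into a multiplicative step followed by an exponential step can be replayed on a toy example. The sketch below (our own tuple encoding; top-level rules only, without reduction contexts) fires \(\mapsto _{\textsf{m}}\) and then \(\mapsto _{\textsf{e}}\) on \((\lambda x.xx)\texttt{I}\).

```python
# Terms: ('var', x) | ('lam', x, body) | ('app', t, u) | ('es', x, body, u),
# where ('es', x, b, u) encodes the explicit substitution b[x<-u].

def subst(t, x, s):
    # Capture-unaware substitution t{x<-s}; safe here since s is closed.
    tag = t[0]
    if tag == 'var':
        return s if t[1] == x else t
    if tag == 'lam':
        return t if t[1] == x else ('lam', t[1], subst(t[2], x, s))
    if tag == 'app':
        return ('app', subst(t[1], x, s), subst(t[2], x, s))
    y, body, u = t[1], t[2], t[3]        # b[y<-u]: y binds in b
    new_body = body if y == x else subst(body, x, s)
    return ('es', y, new_body, subst(u, x, s))

def step_m(t):
    # Multiplicative rule at top level: (lam x.b) u -> b[x<-u].
    if t[0] == 'app' and t[1][0] == 'lam':
        return ('es', t[1][1], t[1][2], t[2])
    return None

def step_e(t):
    # Exponential rule at top level: b[x<-v] -> b{x<-v}, v a value.
    if t[0] == 'es' and t[3][0] in ('var', 'lam'):
        return subst(t[2], t[1], t[3])
    return None

identity = ('lam', 'z', ('var', 'z'))
redex = ('app', ('lam', 'x', ('app', ('var', 'x'), ('var', 'x'))), identity)

es_term = step_m(redex)            # (xx)[x<-I]
assert es_term == ('es', 'x', ('app', ('var', 'x'), ('var', 'x')), identity)
result = step_e(es_term)           # I I
assert result == ('app', identity, identity)
```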

The Characterization of Inscrutable Terms. In the VSC, (in)scrutable terms admit an operational characterization, due to Accattoli and Paolini [9].

Theorem 5

(Operational characterization of (in)scrutability, [9]). \(t\) is scrutable (resp. inscrutable) if and only if \(t\) is \(\texttt{w}\)-normalizing (resp. \(\texttt{w}\)-diverging).

Apart from the proof of Thm. 8 below and Prop. 15 in Section 10, we shall always use the operational characterization and never refer to scrutability itself.

Weak Contextual Pre-Orders Are Inequational. The VSC is confluent and its weak strategy \(\texttt{w}\) is diamond [9]. Moreover, \(\texttt{w}\) is stabilizing and the normalization theorem below holds. These facts and Prop. 1 give that the contextual pre-orders are inequational \(\texttt{w}\)-theories. Moreover, the closed pre-order coincides with the one on Plotkin's calculus.

Proposition 8

1. Weak normalization, [6]: if \(t\rightarrow _{\texttt{vsc}}^* t'\) and \(t'\) is \(\texttt{w}\)-normal then \(t\) is \(\texttt{w}\)-normalizing.

2. Inequational theories: \(\precsim _{\mathcal {C}\mathcal {O}}^{\texttt{w}}\) and \(\precsim _\mathcal {C}^{\texttt{w}}\) are inequational \(\texttt{w}\)-theories.

3. VSC and Plotkin's contextual pre-orders coincide, [6]: on \(\lambda \)-terms, \(\precsim _\mathcal {C}^{\texttt{w}}= \precsim _\mathcal {C}^{\mathtt {p_w}}\).

6 Light Genericity for Weak Call-by-Value

Here, we prove a new result: light genericity for weak evaluation in the VSC.

Takahashi’s Technique Does Not Really Scale Up. Proving CbV light genericity via Takahashi’s technique is not as elegant as for CbN. We did develop such a proof, but it is considerably more involved than for CbN. There are various reasons. Firstly, the substitutivity property of Prop. 5 does not hold in CbV: substitutivity for values does hold, but one really needs general substitutivity. Secondly, Takahashi’s trick lifting genericity as substitution to genericity as context also breaks, because it is based on adding abstractions, which do not change unsolvability but do affect inscrutability. Thirdly, head reduction reduces only on the head, while weak reduction reduces in all sub-terms out of abstractions, which raises additional difficulties. Therefore, we follow a different approach.

Light Genericity via Multi Types. We provide a proof of light genericity relying on Accattoli and Guerrieri’s characterization of \(\texttt{w}\)-diverging terms [6] via Ehrhard’s CbV multi types [20] (multi types are also known as non-idempotent intersection types). The idea behind the proof is very simple: we show that multi types induce a pre-order \(\precsim _{type}\) contained in the open contextual pre-order, that is, \(\precsim _{type}\subseteq \precsim _{\mathcal {C}\mathcal {O}}^{\texttt{w}}\), and that \(\texttt{w}\)-diverging terms are minimum elements for \(\precsim _{type}\), which implies that they are minimums for \(\precsim _{\mathcal {C}\mathcal {O}}^{\texttt{w}}\). The proof itself is very simple as well. What is less simple is the characterization of \(\texttt{w}\)-diverging terms via multi types, which however we use as a black box from the literature. The same technique can also be used in CbN, since \(\texttt{h}\)-diverging terms can also be characterized via multi types.

Our argument via multi types is similar to Ghilezan's, based on intersection types for CbN [24], even if the details are quite different: she proves a different statement, namely heavy genericity in its as-application variant (see the footnote at page 5), and she uses intersection types (which are idempotent, or non-linear). We use multi types because the result from the literature that we exploit is based on them, but the proof technique could also be based on intersection types (once the result from the literature is adapted, which is possible).

Fig. 4. Call-by-Value Multi Type System for VSC.

CbV Multi Types. We introduce the bare minimum about CbV multi types, since here they are used only as a tool, not as an object of study. For more, see [5, 6].

The definition of the multi type system for the VSC is in Figure 4. Multi types \(M\) are defined by mutual induction with linear types \(L\). Multi types are finite multisets \([{L}_{1},\ldots ,{L}_{n}]\), which intuitively denote the intersection \(L_1 \cap \ldots \cap L_n\), where the intersection \(\cap \) is a commutative, associative, and non-idempotent (\(A \cap A \not = A\)) operator, the neutral element of which is \([~]\), the empty multiset. Note that there is no ground type; its role is played by the empty multi type \([~]\).

Typing judgments have the shape \(\varGamma \vdash t\!:\!T\), where \(T\) is a linear or a multi type and \(\varGamma \) is a typing context, that is, an assignment of multi types to a finite set of variables (\(\varGamma = x_1 \!:\!M_1, \ldots , x_n \!:\!M_n\)). A typing derivation \(\pi \vartriangleright \varGamma \vdash t\!:\!M\) is a tree built from the rules in Figure 4 and ending with the typing judgment \(\varGamma \vdash t\!:\!M\).

Typing Rules. Linear types only type values, via the rules \(\textsf{ax}\) and \(\lambda \). To give a multi type to a value \(v\), one has to use the \(\textsf{many}\) rule, turning an indexed family of linear types for \(v\) into a multi type. Note that any value can be typed with the empty multi type \([~]\). The symbol \(\uplus \) is the disjoint union operator on multisets (corresponding to the intersection operator when intersections are multisets).
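The multiset algebra underlying multi types can be illustrated with Python's `collections.Counter`, which implements a commutative, associative, non-idempotent union with the empty multiset as neutral element (a sketch of the algebra only, not of the type system itself):

```python
from collections import Counter

# Linear types are represented here by hashable labels; a multi type is a
# multiset (Counter) of them.
A = 'A'
M = Counter([A, A])        # the multi type [A, A] -- a multiset, not a set
N = Counter([A])           # the multi type [A]

assert M != N              # non-idempotence: A ∩ A is not A
assert N + N == M          # the disjoint union ⊎ is multiset sum
assert N + Counter() == N  # the empty multi type [] is neutral for ⊎
```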

Characterization of Termination. The key property of CbV multi types is that typability characterizes termination with respect to weak evaluation \(\rightarrow _{\texttt{w}}\); therefore \(\texttt{w}\)-diverging terms are simply the untypable ones. The characterization is proved via subject reduction and expansion.

Theorem 6

(Characterization of termination, [6]).

1. Subject reduction and expansion: let \(t\rightarrow _{\texttt{vsc}}u\). Then \(\varGamma \vdash t\!:\!M\) iff \(\varGamma \vdash u\!:\!M\).

2. \(t\) is \(\rightarrow _{\texttt{w}}\)-normalizing if and only if there exist \(\varGamma \) and \(M\) such that \(\varGamma \vdash t\!:\!M\).

Type Pre-order. The type pre-order is defined as follows.

Definition 7

(Type pre-order). The type pre-order \(t\precsim _{type}t'\) holds if \(\varGamma \vdash t\!:\!M\) implies \(\varGamma \vdash t'\!:\!M\) for all \(\varGamma \) and \(M\).

Point 2 of Thm. 6 ensures that \(\precsim _{type}\) is both \(\texttt{w}\)-ground—which is the key point of the proof technique—and \(\texttt{w}\)-adequate. We also show that \(\precsim _{type}\) is an inequational \(\texttt{w}\)-theory. Point 1 of Thm. 6 implies that \(\precsim _{type}\) contains \(\texttt{w}\)-conversion. Compatibility holds because \(\precsim _{type}\) is defined via a compositional type system.

Proposition 9

The type pre-order \(\precsim _{type}\) is a \(\texttt{w}\)-ground, \(\texttt{w}\)-adequate, and consistent inequational \(\texttt{w}\)-theory.

Adequacy and compatibility of \(\precsim _{type}\) imply that \(\precsim _{type}\subseteq \precsim _{\mathcal {C}\mathcal {O}}^{\texttt{w}}\), hence minimum elements of \(\precsim _{type}\) are minimum for \(\precsim _{\mathcal {C}\mathcal {O}}^{\texttt{w}}\).

Theorem 7

Light genericity for \(\texttt{w}\): \(\precsim _{\mathcal {C}\mathcal {O}}^{\texttt{w}}\) is \(\texttt{w}\)-ground.

7 CbV Maximality

Here, we use light genericity to prove maximality of \(\precsim _{\mathcal {C}\mathcal {O}}^{\texttt{w}}\) and the fact that \(\texttt{w}\) is openable, adapting the proofs for the head case.

Maximality of \(\precsim _{\mathcal {C}\mathcal {O}}^{\texttt{w}}\). The following result adapts to our setting a result of Accattoli and Guerrieri [6, Thm 6.5], itself adapting a result by Egidi et al. [19, Prop 35].

Theorem 8

1. Any \(\texttt{w}\)-ground inequational theory \(\mathcal {T}\) that is not \(\texttt{w}\)-adequate is inconsistent.

2. Maximality of \(\precsim _{\mathcal {C}\mathcal {O}}^{\texttt{w}}\): \(\precsim _{\mathcal {C}\mathcal {O}}^{\texttt{w}}\) is a maximal consistent inequational theory.

Proof

1. Since \(\mathcal {T}\) is not \(\texttt{w}\)-adequate, there are \(t\) \(\texttt{w}\)-normalizing and \(u\) \(\texttt{w}\)-diverging such that \(t\le _{\mathcal {T}} u\). Since \(t\) is \(\texttt{w}\)-normalizing, \(t\) is scrutable, that is, there is a testing context \(T\) sending it to a value \(v\). By the definition of inequational \(\texttt{w}\)-theory, we have \( v=_{\mathcal {T}} T\langle t\rangle \le _{\mathcal {T}} T\langle u\rangle \). Now, let \(s\) be a term and \(y\notin \texttt{fv}(s)\). Then \(s=_{\mathcal {T}} (\lambda y.s) v\) because \(=_{\texttt{vsc}}\subseteq =_\mathcal {T}\) by definition of inequational theory. By the compatibility of theories and \(v\le _{\mathcal {T}} T\langle u\rangle \), we obtain \((\lambda y.s) v\le _{\mathcal {T}} (\lambda y.s) T\langle u\rangle \). Since \(u\) is \(\texttt{w}\)-diverging, thus inscrutable, \(T\langle u\rangle \) is also \(\texttt{w}\)-diverging. Since \(\mathcal {T}\) is \(\texttt{w}\)-ground and both \(T\langle u\rangle \) and \((\lambda y.s) T\langle u\rangle \) are \(\texttt{w}\)-diverging, \((\lambda y.s) T\langle u\rangle =_{\mathcal {T}} T\langle u\rangle \). Summing up, \(s=_{\mathcal {T}} (\lambda y.s) v\le _{\mathcal {T}} (\lambda y.s) T\langle u\rangle \le _{\mathcal {T}} T\langle u\rangle \) and, since \(\mathcal {T}\) is \(\texttt{w}\)-ground, \(T\langle u\rangle \le _{\mathcal {T}}s\). Hence, \(s=_{\mathcal {T}} T\langle u\rangle \) for every term \(s\), that is, \(\mathcal {T}\) is inconsistent.

2. From Point 1 and CbV light genericity (Thm. 7), as in the head case.    \(\square \)

The proof of Thm. 8.1 is similar to the one of the CbN case, but it is not the same argument: the CbN one relies on solvability, reduction to the identity, and head context closure; the CbV one relies on scrutability, reduction to a value, a different context closure, and the fact that diverging arguments cannot be erased in CbV. Therefore, our proofs of maximality cannot be done abstractly.

The fact that weak evaluation is openable then follows as in the head case.

Proposition 10

(Weak evaluation is openable in the VSC). Open and closed weak contextual pre-orders coincide: \(\precsim _{\mathcal {C}\mathcal {O}}^{\texttt{w}}\,=\, \precsim _\mathcal {C}^{\texttt{w}}\).

8 Co-Genericity

Here, we study a new notion dual to light genericity, which we dub co-genericity.

\(\texttt{s}\)-Super Terms. In the \(\lambda \)-calculus (both in CbN and CbV) there are terms reducing to an infinite sequence of abstractions using strong evaluation. For instance, let \(\delta _{\lambda }{:}{=}\lambda x.\lambda y.xx\); then \(\varOmega _{\lambda }{:}{=}\delta _{\lambda }\delta _{\lambda }\) is one such term. Indeed, its weak evaluation gives \( \varOmega _{\lambda } \mapsto _{{\beta _v}}\lambda y.\varOmega _{\lambda }\). Now, the new copy of \(\varOmega _{\lambda }\) shall itself (strongly) reduce to \(\lambda y.\varOmega _{\lambda }\), and so on, producing \(\lambda y.\lambda y.\lambda y.\ldots \). Such a behavior, when seen with respect to weak evaluation, is a form of hereditary, or super, normalization.
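The single step \(\varOmega _{\lambda } \mapsto _{{\beta _v}}\lambda y.\varOmega _{\lambda }\) can be replayed mechanically. The following sketch (our own tuple encoding of terms) performs one top-level \(\beta _v\)-step and checks that the result is exactly \(\lambda y.\varOmega _{\lambda }\), with the new copy of \(\varOmega _{\lambda }\) sitting under the abstraction, where weak evaluation stops.

```python
# Terms: ('var', x) | ('lam', x, body) | ('app', t, u).

def subst(t, x, s):
    # Naive substitution t{x<-s}; safe here since s is closed.
    tag = t[0]
    if tag == 'var':
        return s if t[1] == x else t
    if tag == 'lam':
        return t if t[1] == x else ('lam', t[1], subst(t[2], x, s))
    return ('app', subst(t[1], x, s), subst(t[2], x, s))

def betav_step(t):
    # One top-level beta_v-step: (lam x.b) v -> b{x<-v}, v a value.
    if t[0] == 'app' and t[1][0] == 'lam' and t[2][0] in ('var', 'lam'):
        return subst(t[1][2], t[1][1], t[2])
    return None

# delta_lam := lam x.lam y.xx and Omega_lam := delta_lam delta_lam
delta_lam = ('lam', 'x', ('lam', 'y', ('app', ('var', 'x'), ('var', 'x'))))
omega_lam = ('app', delta_lam, delta_lam)

# Omega_lam -> lam y.Omega_lam: the redex reappears under the abstraction.
assert betav_step(omega_lam) == ('lam', 'y', omega_lam)
```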

Note that the example can be generalized by using \(\delta _{k\lambda }{:}{=}\lambda x.\lambda y_1.\ldots \lambda y_k.xx\) instead of \(\delta _\lambda \), obtaining a family of terms \(\varOmega _{k\lambda }{:}{=}\delta _{k\lambda }\delta _{k\lambda }\) all producing infinitely many head abstractions and with no (finite) reduct in common. As for meaningless terms, it is natural to wonder whether these super meaningful terms can all be consistently collapsed. In the literature, super terms appear in weak CbN as maximum (\(\top \)) elements in Lévy-Longo trees [31]—but we are not aware of a proof that these \(\top \)-enriched Lévy-Longo trees induce a consistent equational theory—and in the hierarchy of unsolvable terms [1, 35] as unsolvable terms of order \(\infty \). In CbV, we believe that super terms have not been studied.

Here we connect the collapsibility of super terms to a sort of dual variant of light genericity. We start by setting up the concept of super normalization abstractly. It is specific to weak strategies and makes sense also for weak CbN.

Definition 8

(\(\texttt{s}\) -super terms). Let \(\texttt{s}\) be a weak strategy. A term \(t\) is \(\texttt{s}\)-super (normalizing) if, co-inductively, \(t\rightarrow _{\texttt{s}}^* \lambda x.t'\) and \(t'\) is \(\texttt{s}\)-super.

Co-genericity is the property stating that \(\texttt{s}\)-super terms are maximum elements for \(\precsim _{\mathcal {C}\mathcal {O}}^{\texttt{s}}\), which shall be captured by the following notion of being \(\texttt{s}\)-roof. As expected, a term \(t\) is maximum for a pre-order \(\le \) if \(u\le t\) for all \(u\in \mathcal {L}\).

Definition 9

Let \(\texttt{s}\) be a weak strategy. An inequational \(\texttt{s}\)-theory \(\le _\mathcal {T}^{\texttt{s}}\) is called:

1. \(\texttt{s}\)-roof: if \(\texttt{s}\)-super terms are maximum terms for \(\le _\mathcal {T}^{\texttt{s}}\);

2. Super \(\texttt{s}\)-adequate: if \(t\le _\mathcal {T}^{\texttt{s}}u\) and \(t\) is \(\texttt{s}\)-super, then \(u\) is \(\texttt{s}\)-super.

Definition 10

(Co-genericity). Let \(\texttt{s}\) be a weak strategy. Co-\(\texttt{s}\)-genericity is the following property: if \(u\) is \(\texttt{s}\)-super and \(C\) is a context such that \(C\langle t\rangle \) is \(\texttt{s}\)-normalizing for some \(t\), then \(C\langle u\rangle \) is \(\texttt{s}\)-normalizing. Concisely: \(\texttt{s}\)-super terms are maximum for \(\precsim _{\mathcal {C}\mathcal {O}}^{\texttt{s}}\). Very concisely: \(\precsim _{\mathcal {C}\mathcal {O}}^{\texttt{s}}\) is \(\texttt{s}\)-roof.

Note that there cannot be a heavy co-genericity property mentioning strong normal forms because \(\texttt{s}\)-super terms are diverging for strong \(\texttt{s}\)-evaluation, by definition. Co-genericity is thus enabled by the switch from heavy to light genericity.

As for light genericity, co-genericity is enough to prove that \(\texttt{s}\)-super terms can be consistently equated (as soon as \(\texttt{s}\) is consistent).

Proposition 11

(Co-collapsibility). Let \(\texttt{s}\) be a consistent weak strategy satisfying co-genericity. Then \(\simeq _{\mathcal {C}\mathcal {O}}^{\texttt{s}}\) equates all \(\texttt{s}\)-super terms and it is consistent.

A weak strategy \(\texttt{s}\) is super consistent if there exists a \(\texttt{s}\)-super term.

Proposition 12

(Characterization of maximum terms for \(\precsim _{\mathcal {C}\mathcal {O}}^{\texttt{s}}\)). Let \(\texttt{s}\) be a super consistent weak strategy satisfying co-genericity. If \(\precsim _{\mathcal {C}\mathcal {O}}^{\texttt{s}}\) is super \(\texttt{s}\)-adequate then the maximum terms for \(\precsim _{\mathcal {C}\mathcal {O}}^{\texttt{s}}\) are exactly the \(\texttt{s}\)-super terms.

Proof

By co-genericity, \(\texttt{s}\)-super terms are maximum for \(\precsim _{\mathcal {C}\mathcal {O}}^{\texttt{s}}\). For the other direction, let \(t\) be a \(\texttt{s}\)-super term, which exists by super consistency of \(\texttt{s}\), and let \(u\) be maximum for \(\precsim _{\mathcal {C}\mathcal {O}}^{\texttt{s}}\). Then \(t\precsim _{\mathcal {C}\mathcal {O}}^{\texttt{s}} u\). By super \(\texttt{s}\)-adequacy, \(u\) is \(\texttt{s}\)-super.   \(\square \)

The two following sections present independent proofs of co-genericity for weak evaluation in the VSC. We do not use multi types for good reasons: \(\texttt{w}\)-super terms are not maximum for \(\precsim _{type}\), see the technical report [8, Prop. 18].

9 CbV Co-Genericity via Takahashi’s Technique

In this section, we prove co-genericity for weak evaluation in the VSC adapting Takahashi’s technique for genericity.

Co-Genericity via Normal Forms. The proof of co-genericity for CbV is based on a key property of \(\texttt{w}\)-super terms with respect to \(\texttt{w}\)-normal forms, akin to the normal genericity lemma of the CbN case. Then co-genericity follows via Takahashi’s trick, which is not problematic here, since \(\texttt{w}\)-super terms are stable by adding head abstractions. Another difficulty arises in CbV, however, which is discussed in the technical report [8, p.27] before the proof of the following lemma.

Lemma 1

(Key property of \(\texttt{w}\) -super terms). Let \(s\) be a \(\texttt{w}\)-super term. If \(n\) is a \(\texttt{w}\)-normal form then \(n\{x{\leftarrow }s\}\) is \(\texttt{w}\)-normalizing.

As CbV evaluation only validates value-substitutivity (substitutivity restricted to values: if \(t\rightarrow _{\texttt{w}}u\) then \(t\{x{\leftarrow }v\}\rightarrow _{\texttt{w}}u\{x{\leftarrow }v\}\) for all values \(v\)), the statement of co-genericity as substitution is split into two points.

Lemma 2

(Co-genericity). Let \(u\) be any term, \(s\) be a \(\texttt{w}\)-super term, \(v\) be any value, and \(v'\) be a \(\texttt{w}\)-super value.

1. Co-genericity as v-substitution: if \(t\{x{\leftarrow }v\}\) is \(\texttt{w}\)-normalizing then so is \(t\{x{\leftarrow }v'\}\).

2. Co-genericity as substitution: if \(t\{x{\leftarrow }u\}\) is \(\texttt{w}\)-normalizing then so is \(t\{x{\leftarrow }s\}\).

3. Co-genericity as context: if \(C\langle u\rangle \) is \(\texttt{w}\)-normalizing then so is \(C\langle s\rangle \).

Super \(\texttt{w}\)-Adequacy for \(\precsim _{\mathcal {C}\mathcal {O}}^{\texttt{w}}\). Co-genericity states that \(\texttt{w}\)-super terms are maximum for \(\precsim _{\mathcal {C}\mathcal {O}}^{\texttt{w}}\). For the full characterization (Prop. 12), we need super adequacy and super consistency. Super consistency is easily verified, as \(\varOmega _\lambda \) exists.

Proposition 13

(Super \(\texttt{w}\) -Adequacy).

1. Super adequacy: \(\precsim _{\mathcal {C}\mathcal {O}}^{\texttt{w}}\) is super \(\texttt{w}\)-adequate.

2. Characterization of maximum terms for \(\precsim _{\mathcal {C}\mathcal {O}}^{\texttt{w}}\): maximum terms for \(\precsim _{\mathcal {C}\mathcal {O}}^{\texttt{w}}\) are exactly \(\texttt{w}\)-super terms.

10 CbV (Co-)Genericity via Applicative Similarity

In this section, we present alternative proofs of genericity and co-genericity for weak evaluation in the VSC. We use a well-known tool developed to study Plotkin’s CbV contextual equivalence \(\simeq _\mathcal {C}^{\mathtt {p_w}}\), namely the CbV variant [27, 39] of Abramsky’s applicative (bi)similarity [1].

The following definition differs slightly from the literature on two points. Firstly, we use a well-known equivalent definition that does not ask that the results of evaluation be similar (which is needed for the definition of applicative simulations, but not for applicative similarity). Secondly, we replace Plotkin's CbV with the VSC, as the two are equivalent on closed terms.

Definition 11

(Applicative similarity [1]). Applicative similarity \(t\precsim _{AS}^{\texttt{w}}u\) is the relation on closed terms defined by: if \(t\,v_1\ldots v_n\) is \(\texttt{w}\)-normalizing then \(u\,v_1\ldots v_n\) is \(\texttt{w}\)-normalizing, for all \( n\in \mathbb {N}\) and \(v_1,\ldots ,v_n\) closed values. Applicative similarity is extended to open terms via closing substitutions: \(t\precsim _{AS}^{\texttt{w}}u\) if \(t\sigma \precsim _{AS}^{\texttt{w}}u\sigma \) for all substitutions of values \(\sigma \) closing \(t\) and \(u\).
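Applicative similarity quantifies over all applicative tests and is undecidable, so it cannot be computed; still, Definition 11 can be illustrated on a bounded fragment. The sketch below (our own tuple encoding, with a fueled weak CbV evaluator used purely as an illustration) runs a few applicative tests for \(\varOmega \) against \(\texttt{I}\), consistently with \(\varOmega \) being \(\texttt{w}\)-diverging and thus a minimum, so that \(\varOmega \precsim _{AS}^{\texttt{w}}\texttt{I}\).

```python
# Terms: ('var', x) | ('lam', x, body) | ('app', t, u), all closed below.

def subst(t, x, s):
    # Naive substitution; safe below since the substituted terms are closed.
    tag = t[0]
    if tag == 'var':
        return s if t[1] == x else t
    if tag == 'lam':
        return t if t[1] == x else ('lam', t[1], subst(t[2], x, s))
    return ('app', subst(t[1], x, s), subst(t[2], x, s))

def wstep(t):
    # One weak CbV step: reduce inside applications but never under lambda.
    if t[0] == 'app':
        f, a = t[1], t[2]
        rf = wstep(f)
        if rf is not None:
            return ('app', rf, a)
        ra = wstep(a)
        if ra is not None:
            return ('app', f, ra)
        if f[0] == 'lam' and a[0] == 'lam':  # closed values are abstractions
            return subst(f[2], f[1], a)
    return None

def weval(t, fuel=100):
    # Fueled evaluation: the weak normal form, or None if the budget runs out.
    for _ in range(fuel):
        r = wstep(t)
        if r is None:
            return t
        t = r
    return None

I = ('lam', 'z', ('var', 'z'))
delta = ('lam', 'x', ('app', ('var', 'x'), ('var', 'x')))
omega = ('app', delta, delta)                       # w-diverging
K = ('lam', 'a', ('lam', 'b', ('var', 'a')))

for args in [(), (I,), (K, I)]:                     # a few applicative tests
    lhs, rhs = omega, I
    for v in args:
        lhs, rhs = ('app', lhs, v), ('app', rhs, v)
    assert weval(lhs) is None        # Omega v1...vn exhausts the budget
    assert weval(rhs) is not None    # I v1...vn w-normalizes
```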

From the following lemma, it follows easily that \(\texttt{w}\)-diverging and \(\texttt{w}\)-super terms are minimum and maximum for \(\precsim _{AS}^{\texttt{w}}\).

Lemma 3

If \(t\) is \(\texttt{w}\)-diverging (resp. \(\texttt{w}\)-super) then so are \(t\{x{\leftarrow }v\}\) and \(tv\).

Proposition 14

1. Minimums: \(\texttt{w}\)-diverging terms are minimum for \(\precsim _{AS}^{\texttt{w}}\).

2. Maximums: \(\texttt{w}\)-super terms are maximum for \(\precsim _{AS}^{\texttt{w}}\).

Proof

1. Let \(t\) be a \(\texttt{w}\)-diverging term and \(u\) any term. Then, by Lemma 3, for any closing substitution \(\sigma \) of \(t\) and \(u\), for any \(n\), and for any values \(v_{1},\ldots ,v_{n}\), the term \(t\sigma \,v_{1}\ldots v_{n}\) is \(\texttt{w}\)-diverging. Hence \(t\precsim _{AS}^{\texttt{w}}u\) for any term \(u\), that is, \(t\) is a minimum term for \(\precsim _{AS}^{\texttt{w}}\).

2. Let \(t\) be a \(\texttt{w}\)-super term and \(u\) any term. For any closing substitution \(\sigma \) of \(t\) and \(u\), any \(n\), and any values \(v_{1},\ldots ,v_{n}\), either \(u\sigma \,v_{1}\ldots v_{n}\) is \(\texttt{w}\)-diverging or it is \(\texttt{w}\)-normalizing. In both cases, by Lemma 3, \(t\sigma \,v_{1}\ldots v_{n}\) is \(\texttt{w}\)-super, hence \(\texttt{w}\)-normalizing. Thus, \(u\precsim _{AS}^{\texttt{w}}t\).   \(\square \)

Proving (co-)genericity amounts to showing that the results of the previous proposition transfer to \(\precsim _{\mathcal {C}\mathcal {O}}^{\texttt{w}}\). This can be done by showing \(\precsim _{AS}^{\texttt{w}}\subseteq \precsim _{\mathcal {C}\mathcal {O}}^{\texttt{w}}\) via:

  1.

    The soundness of applicative similarity \(\precsim _{AS}^{\texttt{w}}\) for Plotkin’s pre-order \(\precsim _\mathcal {C}^{\mathtt {p_w}}\), that is, that \(\precsim _{AS}^{\texttt{w}}\subseteq \precsim _\mathcal {C}^{\mathtt {p_w}}\) (completeness holds as well, but it is not useful here);

  2.

    The equivalence \(\precsim _\mathcal {C}^{\mathtt {p_w}}= \precsim _\mathcal {C}^{\texttt{w}}\), given by Prop. 8.3;

  3.

    The openability of \(\texttt{w}\)-evaluation, that is, \(\precsim _\mathcal {C}^{\texttt{w}}= \precsim _{\mathcal {C}\mathcal {O}}^{\texttt{w}}\).
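Chained together, the three points give the inclusion that transfers Prop. 14 to the open contextual pre-order:

```latex
\precsim_{AS}^{\texttt{w}}
  \ \subseteq\ \precsim_{\mathcal{C}}^{\mathtt{p_w}}   % 1. soundness, via Howe's method
  \ =\ \precsim_{\mathcal{C}}^{\texttt{w}}             % 2. Prop. 8.3
  \ =\ \precsim_{\mathcal{C}\mathcal{O}}^{\texttt{w}}  % 3. openability
```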

Soundness of \(\precsim _{AS}^{\texttt{w}}\) is a non-trivial result from the literature, established via Howe’s method [27, 39], which we use here as a black box. As for openability, we proved it in Sect. 7, but that proof uses light genericity (and maximality), which is precisely our goal here; we therefore have to re-prove openability without using light genericity.

\(\texttt{w}\) is Openable without Light Genericity. We know that \(\precsim _{\mathcal {C}\mathcal {O}}^{\texttt{w}}\subseteq \precsim _\mathcal {C}^{\texttt{w}}\), thus we only have to show the other inclusion, which follows from \(\texttt{w}\)-adequacy of \(\precsim _\mathcal {C}^{\texttt{w}}\).

Proposition 15

The inequational theory \(\precsim _\mathcal {C}^{\texttt{w}}\) is \(\texttt{w}\)-adequate, hence \(\texttt{w}\) is openable.

Proof

The full proof is in [8, p. 32]; here we only give the idea for \(\texttt{w}\)-adequacy. Let \(t\precsim _\mathcal {C}^{\texttt{w}}u\) with \(t\) \(\texttt{w}\)-normalizing. We use the operational characterization of scrutability (Thm. 5) to build a closing context \(C\) such that \(C\langle t\rangle \) is \(\texttt{w}\)-normalizing, and such that if \(u\) were \(\texttt{w}\)-diverging then so would be \(C\langle u\rangle \).   \(\square \)
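The closing context in this proof idea can be pictured concretely. The sketch below is purely illustrative (it is not the construction of [8]): it closes an open term \(t\) with the context \(C=(\lambda x_1.\ldots \lambda x_n.\langle \cdot \rangle )\,v_1\ldots v_n\), so that evaluating \(C\langle t\rangle \) substitutes chosen closed values for the free variables of \(t\). The function name `close` and the term representation are our assumptions.

```python
# Illustrative only: one simple way to close an open term t with free
# variables x1..xn is the context C = (\x1. ... \xn. <.>) v1 ... vn,
# whose hole plugged with t evaluates t with each xi replaced by vi.
# Term representation: ('var', x) | ('lam', x, body) | ('app', t, u).

def close(t, free_vars, values):
    # Builds (\x1. ... \xn. t) v1 ... vn; assumes len(free_vars) == len(values).
    c = t
    for x in reversed(free_vars):   # innermost binder is the last variable
        c = ('lam', x, c)
    for v in values:                # apply to the values, outermost binder first
        c = ('app', c, v)
    return c

ident = ('lam', 'y', ('var', 'y'))
# Closing the open term x with the value \y.y gives (\x. x) (\y. y):
closed = close(('var', 'x'), ['x'], [ident])
```

The actual context of the proof must moreover be tuned, using scrutability, so that \(C\langle t\rangle \) keeps \(\texttt{w}\)-normalizing while \(C\langle u\rangle \) would witness any \(\texttt{w}\)-divergence of \(u\).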

(Co-)genericity via Applicative Similarity. With the three points above established, we obtain new proofs of light genericity and co-genericity.

Proposition 16

(CbV light (co-)genericity). \(\precsim _{\mathcal {C}\mathcal {O}}^{\texttt{w}}\) is \(\texttt{w}\)-ground and \(\texttt{w}\)-roof.

11 Conclusions

We develop in this paper a theory of light genericity, which is as powerful as heavy genericity for proving the collapsibility of meaningless terms, is connected to contextual pre-orders, and can be dualized as co-genericity.

We also provide light proofs of the maximality of open contextual pre-orders, which in turn yield an elegant proof of the fact that the closed and open contextual pre-orders coincide. Lastly, we show that CbV applicative similarity can be used for alternative, simple proofs of light (co-)genericity; these proofs are easily adaptable to the (weak) CbN case.

Summing up, our work paints Barendregt’s genericity with a fresh, modern hue, connecting it to program equivalences and maximality, following an abstract approach and providing neat proofs.