1 Introduction

Logical relations are arguably the most widely used method for reasoning on higher-order languages. Historically, early examples of logical relations [44, 46, 47, 51, 55, 56, 58, 59] were based on denotational semantics, before the method evolved into logical relations based on operational semantics [7, 17, 34, 50, 52, 53]. Today, operationally-based logical relations are ubiquitous and serve purposes ranging from strong normalization proofs [6] and safety properties [21, 22] to reasoning about contextual equivalence [5, 60] and formally verified compilation [8, 33, 45, 48], in a variety of settings such as effectful [37], probabilistic [4, 10, 63], and differential programming [15, 40, 41].

Unfortunately, despite the extensive literature, there is a distinct lack of a general formal theory of (operational) logical relations. As a reasoning method, logical relations are applied in a largely empirical manner, more so because their core principles are well understood on an intuitive level. For example, there is typically no formal notion of a logical predicate or relation; instead, if a predicate or relation is defined by induction on types and maps “related inputs to related outputs”, it then meets the informal criteria to be called “logical”. However, the empirical character of logical relations is problematic for two main reasons: (i) complex machinery associated to logical relations needs to be re-established anew on a per-case basis, and (ii) it is hard to abstract and simplify said machinery, even though certain parts of proofs via logical relations seem generic.

Recently, Higher-order Mathematical Operational Semantics [24], or higher-order abstract GSOS, has emerged as a unifying approach to the operational semantics of higher-order languages. In this framework, languages are represented as higher-order GSOS laws, a form of distributive law of a syntax functor \(\varSigma \) over a mixed-variance behaviour bifunctor B. In further work [62], an abstract form of Howe’s method [16, 31, 32] for higher-order abstract GSOS has been identified, in which an otherwise complex and application-specific operational technique is, at the same time, lifted to an appropriate level of generality and reduced to a simple lax bialgebra condition.

In the present paper, we work towards establishing a theory of logical relations based on coalgebra and higher-order abstract GSOS, starting from logical predicates, understood as unary logical relations. In more detail, we present the following contributions:

  1. (i)

    A systematization of the method of logical predicates (Section 3), achieved by

    1. (a)

      identifying logical predicates as certain coalgebraic invariants (Definition 12), parametric in a predicate lifting of the underlying mixed-variance bifunctor,

    2. (b)

      introducing the locally maximal logical refinement \(\square P\) of a predicate P (Definition 14), which enables inductive proofs of \(\square P\), and

    3. (c)

      identifying an abstract setting in which locally maximal logical refinements of predicates exist and are unique (Section 3.3).

  2. (ii)

    The development of efficient reasoning techniques on logical predicates, which we call induction up-to (Theorems 34 and 36), for higher-order GSOS laws satisfying a relative flatness condition (Definition 30).

We illustrate (ii) by providing proofs of strong normalization for typed combinatory logic and type safety for the simply typed \(\lambda \)-calculus which, thanks to the use of our up-to techniques, are significantly shorter and simpler than standard arguments found in the literature. Finally, we exploit the genericity of our framework to study strong normalization on the level of higher-order GSOS laws (Theorem 42). We note that the implementation of typed languages as higher-order GSOS laws as such is also novel.

Full proofs and additional details can be found in the arXiv version [25] of our paper.

Related work While denotational logical relations have been studied in categorical generality, e.g. [27,28,29, 38], general abstract foundations of operational logical relations are far less developed. In recent work [13, 14], Dagnino and Gavazzo introduce a categorical notion of operational logical relations that is largely orthogonal to ours, in particular regarding the parametrization of the framework: In op. cit., the authors work with a fixed fine-grain call-by-value language [42], parametrized by a signature of generic effects, while the notion of logical relation is kept variable and in fact is parametrized over a fibration; contrastingly, we keep to the traditional notion of logical relation but parametrize over the syntax and semantics of the language. Moreover, we work with a small-step operational semantics, whereas the semantics used in op. cit. is an axiomatically defined categorical evaluation semantics.

2 Preliminaries

2.1 Category Theory

Familiarity with basic category theory [43] (e.g. functors, natural transformations, (co)limits, monads) is assumed. We review some concepts and notation.

Notation. Given objects \(X_1, X_2\) in a category \({\mathcal {C}}\), we write \(X_1\times X_2\) for the product and \(\langle f_1, f_2\rangle :X\rightarrow X_1\times X_2\) for the pairing of \(f_i:X\rightarrow X_i\), \(i=1,2\). We let \(X_1+X_2\) denote the coproduct, \({\mathop {\textsf{inl}}}:X_1\rightarrow X_1+X_2\) and \({\mathop {\textsf{inr}}}:X_2\rightarrow X_1+X_2\) the injections, \([g_1,g_2]:X_1+X_2\rightarrow X\) the copairing of \(g_i:X_i\rightarrow X\), \(i=1,2\), and \(\nabla =[{\textsf{id}}_X,{\textsf{id}}_X]:X+X\rightarrow X\) the codiagonal. The slice category \({\mathcal {C}}/X\), where \(X\in {\mathcal {C}}\), has as objects all pairs \((Y,p_Y)\) of an object \(Y\in {\mathcal {C}}\) and a morphism \(p_Y:Y\rightarrow X\), and a morphism from \((Y,p_Y)\) to \((Z,p_Z)\) is a morphism \(f:Y\rightarrow Z\) of \({\mathcal {C}}\) such that \(p_Y = p_Z\cdot f\). The coslice category \(X/{\mathcal {C}}\) is defined dually.

Extensive categories. A category \({\mathcal {C}}\) is (finitely) extensive [12] if it has finite coproducts and for every finite family of objects \(X_i\) (\(i\in I\)) the functor \(E:\prod _{i\in I} {\mathcal {C}}/X_i \rightarrow {\mathcal {C}}/\coprod _{i\in I} X_i\) sending \((p_i:Y_i\rightarrow X_i)_{i\in I}\) to \(\coprod _{i\in I} p_i:\coprod _i Y_i \rightarrow \coprod _i X_i\) is an equivalence of categories. A countably extensive category satisfies the analogous property for countable coproducts. In extensive categories, coproduct injections \({\mathop {\textsf{inl}}},{\mathop {\textsf{inr}}}\) are monic, and coproducts of monomorphisms are monic; generally, coproducts behave like disjoint unions of sets.

Example 1

Examples of countably extensive categories include the category \({\textbf {Set}}\) of sets and functions; the category \({\textbf {Set}}^{{\mathcal {C}}}\) of presheaves on a small category \({\mathcal {C}}\) and natural transformations; and the categories of posets and monotone maps, nominal sets and equivariant maps, and metric spaces and non-expansive maps, respectively.

Algebras. Given an endofunctor F on a category \({\mathcal {C}}\), an F-algebra is a pair (Aa) consisting of an object A and a morphism \(a:FA\rightarrow A\) (the structure). A morphism from (Aa) to an F-algebra (Bb) is a morphism \(h:A\rightarrow B\) of \({\mathcal {C}}\) such that \(h\cdot a = b\cdot Fh\). Algebras for F and their morphisms form a category \({{\,\mathrm{{\textbf {Alg}}}\,}}(F)\), and an initial F-algebra is simply an initial object in that category. We denote the initial F-algebra by \(\mu F\) if it exists, and its structure by \(\iota :F(\mu F) \rightarrow \mu F\). Initial algebras admit the structural induction principle: the algebra \(\mu F\) has no proper subalgebras, that is, every F-algebra monomorphism \(m:(A,a)\rightarrowtail (\mu F,\iota )\) is an isomorphism.

More generally, a free F-algebra on an object X of \({\mathcal {C}}\) is an F-algebra \((F^{\star }X,\iota _X)\) together with a morphism \(\eta _X:X\rightarrow F^{\star }X\) of \({\mathcal {C}}\) such that for every algebra (Aa) and every \(h:X\rightarrow A\) in \({\mathcal {C}}\), there exists a unique F-algebra morphism \(h^\sharp :(F^{\star }X,\iota _X)\rightarrow (A,a)\) such that \(h=h^\sharp \cdot \eta _X\). If free algebras exist on every object, their formation induces a monad \(F^{\star }:{\mathcal {C}}\rightarrow {\mathcal {C}}\), the free monad generated by F. Every F-algebra (Aa) yields an Eilenberg-Moore algebra \(\widehat{a} :F^{\star } A \rightarrow A\) as the free extension of \({\textsf{id}}_A:A\rightarrow A\).

The most familiar example of functor algebras are algebras for a signature. Given a set S of sorts, an S-sorted algebraic signature consists of a set \(\varSigma \) of operation symbols together with a map \(\textsf{ar}:\varSigma \rightarrow S^{\star }\times S\) associating to every \({\mathop {\textsf{f}}}\in \varSigma \) its arity. We write \({\mathop {\textsf{f}}}:s_1\times \cdots \times s_n\rightarrow s\) if \(\textsf{ar}({\mathop {\textsf{f}}})=(s_1,\ldots ,s_n,s)\), and \({\mathop {\textsf{f}}}:s\) if \(n=0\) (in which case \({\mathop {\textsf{f}}}\) is called a constant). Every signature \(\varSigma \) induces a polynomial functor on the category \({\textbf {Set}}^S\) of S-sorted sets, denoted by the same letter \(\varSigma \), given by \((\varSigma X)_s = \coprod _{{\mathop {\textsf{f}}}:s_1\cdots s_n\rightarrow s} \prod _{i=1}^n X_{s_i}\) for \(X\in {\textbf {Set}}^S\) and \(s\in S\). An algebra for the functor \(\varSigma \) is precisely an algebra for the signature \(\varSigma \), viz. an S-sorted set \(A=(A_s)_{s\in S}\) in \({\textbf {Set}}^S\) equipped with an operation \({\mathop {\textsf{f}}}^A:\prod _{i=1}^n A_{s_i}\rightarrow A_s\) for every \({\mathop {\textsf{f}}}:s_1\cdots s_n\rightarrow s\) in \(\varSigma \). Morphisms of \(\varSigma \)-algebras are S-sorted maps respecting the algebraic structure. Given an S-sorted set X of variables, the free algebra \(\varSigma ^{\star }X\) is the \(\varSigma \)-algebra of \(\varSigma \)-terms with variables from X; more precisely, \((\varSigma ^{\star }X)_s\) is inductively defined by \(X_s\subseteq (\varSigma ^{\star }X)_s\) and \({\mathop {\textsf{f}}}(t_1,\ldots ,t_n)\in (\varSigma ^{\star }X)_s\) for all \({\mathop {\textsf{f}}}:s_1\cdots s_n\rightarrow s\) and \(t_i\in (\varSigma ^{\star }X)_{s_i}\). In particular, the free algebra on the empty set is the initial algebra \(\mu \varSigma \); it is formed by all closed terms of the signature. For every \(\varSigma \)-algebra (Aa), the induced Eilenberg-Moore algebra \(\widehat{a}:\varSigma ^{\star }A \rightarrow A\) is given by the map that evaluates terms over A in the algebra A.

Coalgebras. Dual to the notion of algebra, a coalgebra for an endofunctor F on \({\mathcal {C}}\) is a pair (Cc) of an object C (the state space) and a morphism \(c:C\rightarrow FC\) (the structure).

2.2 Higher-Order Abstract GSOS

We summarize the framework of higher-order abstract GSOS [24], which extends the original, first-order counterpart introduced by Turi and Plotkin [61]. In higher-order abstract GSOS, the operational semantics of a higher-order language is presented in the form of a higher-order GSOS law, a categorical structure parametric in

  1. (1)

    a category \({\mathcal {C}}\) with finite products and coproducts;

  2. (2)

    an object \(V\in {\mathcal {C}}\) of variables;

  3. (3)

    an endofunctor \(\varSigma :{\mathcal {C}}\rightarrow {\mathcal {C}}\), where \(\varSigma = V+ \varSigma '\) for some endofunctor \(\varSigma '\), such that free \(\varSigma \)-algebras exist on every object (hence \(\varSigma \) generates a free monad \(\varSigma ^{\star }\));

  4. (4)

    a mixed-variance bifunctor \(B:{\mathcal {C}}^\textsf{op}\times {\mathcal {C}}\rightarrow {\mathcal {C}}\).

The functors \(\varSigma \) and B represent the syntax and the behaviour of a higher-order language. The motivation behind B having two arguments is that transitions have labels, which behave contravariantly, and poststates, which behave covariantly; in term models the objects of labels and states will coincide. The presence of an object V of variables is a technical requirement for the modelling of languages with variable binding [19, 20], such as the \(\lambda \)-calculus. An object of \(V/{\mathcal {C}}\), the coslice category of \(V\)-pointed objects, is thought of as a set X of programs with an embedding \(p_X:V\rightarrow X\) of the variables. In point-free calculi, e.g. xTCL as introduced below, we put \(V=0\) (the initial object).

Definition 2

A (\(V\)-pointed) higher-order GSOS law of \(\varSigma \) over B is a family of morphisms (1) that is dinatural in \((X,p_X) \in V/{\mathcal {C}}\) and natural in \(Y\in {\mathcal {C}}\):

$$\begin{aligned} \varrho _{(X,p_X),Y} :\varSigma (X \times B(X,Y))\rightarrow B(X, \varSigma ^\star (X+Y)) \end{aligned}$$
(1)

Notation 3

  1. (i)

    In (1), we have implicitly applied the forgetful functor \(V/{\mathcal {C}}\rightarrow C\) at \((X,p_X)\). In addition, we write \(\varrho _{X,Y}\) for \(\varrho _{(X,p_X),Y}\) if the point \(p_X\) is clear from the context.

  2. (ii)

    For \((A,a)\in {{\,\mathrm{{\textbf {Alg}}}\,}}(\varSigma )\), we view A as \(V\)-pointed by .

Informally, \(\varrho _{X,Y}\) assigns to an operation of the language with formal arguments from X having specified next-step behaviours in B(XY) (i.e. with labels in X and formal poststates in Y) a next-step behaviour in \(B(X, \varSigma ^\star (X+Y))\), i.e. with the same labels, and with poststates being program terms mentioning variables from both X and Y. Every higher-order GSOS law (1) induces a canonical operational model \(\gamma :{\mu \varSigma }\rightarrow B({\mu \varSigma },{\mu \varSigma })\), viz. a \(B({\mu \varSigma },-)\)-coalgebra on the initial algebra \({\mu \varSigma }\), defined by primitive recursion [36, Prop. 2.4.7] as the unique morphism \(\gamma \) making the following diagram commute:

figure b

Here, we regard the initial algebra \(({\mu \varSigma },\iota )\) as V-pointed as explained in Notation 3.

Simply Typed SKI Calculus. We illustrate the ideas behind higher-order abstract GSOS with an extended version of the simply typed SKI calculus [30], a typed combinatory logic which we call xTCL. It is expressively equivalent to the simply typed \(\lambda \)-calculus but does not use variables; hence it avoids the complexities associated to variable binding and substitution in the \(\lambda \)-calculus, which we treat in Section 4.2. The set \(\textsf{Ty}\) of types is inductively defined as

$$\begin{aligned} \textsf{Ty}:\,\!:=\textsf{unit}\mid \textsf{Ty} \rightarrowtriangle \textsf{Ty}. \end{aligned}$$
(2)

The constructor \( \rightarrowtriangle \) is right-associative, i.e. \(\tau _{1} \rightarrowtriangle \tau _{2} \rightarrowtriangle \tau _{3}\) is parsed as \(\tau _{1} \rightarrowtriangle ({\tau _{2} \rightarrowtriangle \tau _{3}})\). The terms of \({\textbf {xTCL}} \) are formed over the \(\textsf{Ty}\)-sorted signature \(\varSigma \) whose operation symbols are listed below, with \(\tau ,\tau _1,\tau _2,\tau _3\) ranging over all types in \(\textsf{Ty}\):

$$\begin{aligned} \begin{array}{ll} \textsf{e} :\textsf{unit}&{} \textsf{app}_{\tau _1,\tau _2}:(\tau _{1} \rightarrowtriangle \tau _{2})\times \tau _1\rightarrow \tau _2 \\ S_{\tau _{1},\tau _{2},\tau _{3}} :(\tau _{1} \rightarrowtriangle \tau _{2} \rightarrowtriangle \tau _{3}) \rightarrowtriangle (\tau _{1} \rightarrowtriangle \tau _{2}) \rightarrowtriangle \tau _{1} \rightarrowtriangle \tau _{3} &{} K_{\tau _{1},\tau _{2}} :\tau _{1} \rightarrowtriangle \tau _{2} \rightarrowtriangle \tau _{1} \\ S'_{\tau _1,\tau _2,\tau _3}:(\tau _{1} \rightarrowtriangle \tau _{2} \rightarrowtriangle \tau _{3})\rightarrow ({(\tau _{1} \rightarrowtriangle \tau _{2})} \rightarrowtriangle \tau _{1} \rightarrowtriangle \tau _{3}) &{} K'_{\tau _1,\tau _2}:\tau _1\rightarrow (\tau _{2} \rightarrowtriangle \tau _{1}) \\ S''_{\tau _1,\tau _2,\tau _3}:(\tau _{1} \rightarrowtriangle \tau _{2} \rightarrowtriangle \tau _{3})\times (\tau _{1} \rightarrowtriangle \tau _{2})\rightarrow (\tau _{1} \rightarrowtriangle \tau _{3}) &{} I_{\tau } :\tau \rightarrowtriangle \tau \end{array} \end{aligned}$$
Fig. 1.
figure 1

(Call-by-name) operational semantics of xTCL.

We let \(\textsf{Tr}={\mu \varSigma }\) denote the \(\textsf{Ty}\)-sorted set of closed \(\varSigma \)-terms. Informally, \(\textsf{app}\) represents function application (we write \(s\, t\) for \(\textsf{app}(s,t)\)), and the constants \(I_\tau \), \(K_{\tau _1,\tau _2}\), \(S_{\tau _1,\tau _2,\tau _3}\) represent the \(\lambda \)-terms \(\lambda t.\,t\), \(\lambda t.\,\lambda s.\, t\) and \(\lambda t.\,\lambda s.\,\lambda u.\, (s\, u)\, (t\, u)\), respectively. The operational semantics of xTCL involves three kinds of transitions: , and . It is presented in Figure 1; here, \(p,p',q,t\) range over terms in \(\textsf{Tr}\) of appropriate type. Intuitively, identifies s as an explicitly irreducible term; states that s acts as a function mapping t to r; and \(s\rightarrow t\) indicates that s reduces to t. Our use of labelled transitions in higher-order operational semantics is inspired by work on bisimilarity in the \(\lambda \)-calculus [1, 26]. The use of \(K'\), \(S'\) and \(S''\) does not impact the behaviour of programs, except for possibly adding more unlabelled transitions. For example, the standard rule \(S t{} s{} e\rightarrow (te)(se)\) for the S-combinator is rendered as the chain of transitions \( S t{} s{} e\rightarrow S'(t)\, s e\rightarrow S''(t,s)\, e\rightarrow (te)(se).\) The transition system for xTCL is deterministic: for every term s, either , or there exists a unique t such that \(s\rightarrow t\), or for each appropriately typed t there exists a unique \(s_t\) such that . Therefore, given

$$\begin{aligned} B_\tau (X,Y) &= Y_\tau + D_\tau (X,Y),\end{aligned}$$
(3)
$$\begin{aligned} \qquad D_{\textsf{unit}}(X,Y) &= 1= \{*\}\qquad \text {and} \qquad D_{\tau _{1} \rightarrowtriangle \tau _{2}}(X,Y) = Y_{\tau _{2}}^{X_{\tau _{1}}}, \end{aligned}$$
(4)

the operational rules in Figure 1 determine a \({\textbf {Set}}^\textsf{Ty}\)-morphism \(\gamma :\textsf{Tr}\rightarrow B(\textsf{Tr},\textsf{Tr})\):

(5)

Proposition 4

The object assignments (3) and (4) extend to mixed-variance bifunctors

$$\begin{aligned} B,D :({\textbf {Set}}^{\textsf{Ty}})^{\textsf{op}} \times {\textbf {Set}}^{\textsf{Ty}} \rightarrow {\textbf {Set}}^{\textsf{Ty}}. \end{aligned}$$
(6)

The semantics of xTCL in Figure 1 corresponds to a (0-pointed) higher-order GSOS law of the syntax functor \(\varSigma \) over the behaviour bifunctor B, i.e. to a family of maps (1) dinatural in \(X\in {\textbf {Set}}^\textsf{Ty}\) and natural in \(Y\in {\textbf {Set}}^\textsf{Ty}\). The maps \(\varrho _{X,Y}\) are cotuples defined by distinguishing cases on the constructors \(\textsf{e}, S, S', S'', K, K', I, \textsf{app}\) of xTCL, and each component of \(\varrho \) is determined by the rules that apply to the corresponding constructor. We provide a few illustrative cases; see [25, p. 25], for a complete definition.

$$\begin{aligned} \varrho _{X,Y} :\varSigma (X \times B(X,Y)) & \rightarrow B(X, \varSigma ^\star (X+Y)) {} & {} \end{aligned}$$
(7)
$$\begin{aligned} \varrho _{X,Y}~ (S''_{\tau _{1},\tau _{2},\tau _{3}}((p,f),(q,g))) & = \lambda t.\,(p \, t)\,(q \, t){} & {} \end{aligned}$$
(8)
$$\begin{aligned} \varrho _{X,Y}~ ((p,f)\,(q,g)) & = f(q){} & {} \text {if }f :Y_{\tau _{2}}^{X_{\tau _{1}}} \end{aligned}$$
(9)
$$\begin{aligned} \varrho _{X,Y}~ ((p,f)\,(q,g)) & = f q {} & {} \text {if }f :Y_{\tau _{1} \rightarrowtriangle \tau _{2}}{} & {} \end{aligned}$$
(10)

The operational model \(\gamma :\textsf{Tr}\rightarrow B(\textsf{Tr},\textsf{Tr})\) of \(\varrho \) coincides with the coalgebra (5).

Remark 5

The rules for application in Figure 1 implement the call-by-name evaluation strategy. Other strategies can be captured by varying the rules and consequently the corresponding higher-order GSOS law. For the call-by-value strategy, one replaces the last rule with (11) and (12) below and modifies clause (9) in the definition of \(\varrho \) accordingly. One can also model the traditional view of combinatory logic as a rewrite system [30] where any redex can be reduced, no matter how deeply. This amounts to specifying a maximally nondeterministic strategy by adding the rule (13) below to Figure 1. Notably, this makes the operational model nondeterministic, and hence the corresponding higher-order GSOS law relies on the behaviour functor \(\mathcal {P}B\) instead of the original B given by (3), where \(\mathcal {P}\) is the powerset functor.

figure j

3 Coalgebraic Logical Predicates

3.1 Predicate Lifting

Predicates and relations on coalgebras are often most conveniently modelled through predicate and relation liftings [39] of the underlying type functors. In the following we introduce a framework of predicate liftings for mixed-variance bifunctors, adapting existing notions of relation lifting [62], which enables reasoning about “higher-order” coalgebras, such as operational models of higher-order GSOS laws. The following global assumptions ensure that predicates and relations behave in an expected manner:

Assumptions 6

From now on, we fix \({\mathcal {C}}\) to be a complete, well-powered and extensive category in which, additionally, strong epimorphisms are stable under pullbacks.

The categories of Example 1 satisfy these assumptions. Since \({\mathcal {C}}\) is complete and well-powered, every morphism f admits a (strong epi, mono)-factorization \(f=m\cdot e\) [11, Prop. 4.4.3]; we call m the image of f. The category \({\textbf {Pred}}_{}({\mathcal {C}})\) of predicates over \({\mathcal {C}}\) has as objects all monics (predicates) \(P \overset{}{\rightarrowtail } X\) from \({\mathcal {C}}\), and as morphisms \((p :P \overset{}{\rightarrowtail } X) \rightarrow (q :Q \overset{}{\rightarrowtail } Y)\) all pairs \((f :X \rightarrow Y, f|_P :P \rightarrow Q)\) such that \(q\cdot f|_P = f\cdot p\) (so \(f|_P\) is uniquely determined by f). (Co)products in \({\textbf {Pred}}_{}({\mathcal {C}})\) are lifted from \({\mathcal {C}}\). The fiber \({\textbf {Pred}}_{X}({\mathcal {C}})\) is the subcategory of all monics \(P\overset{}{\rightarrowtail } X\) for fixed X and morphisms \(({\textsf{id}}_X,-)\). It is is preordered by \(p\le q\) if p factors through q; identifying pq if \(p\le q\) and \(q\le p\), we regard \({\textbf {Pred}}_{X}({\mathcal {C}})\) as a poset. Since \({\mathcal {C}}\) is complete and well-powered, \({\textbf {Pred}}_{X}({\mathcal {C}})\) is a complete lattice; we write \(\bigwedge \) for meets (i.e. pullbacks) and \(\bigvee \) for joins. We will also write \(f{}^\star [P]\) for the inverse image of a predicate \(p:P\overset{}{\rightarrowtail } X\) under \(f:Y\rightarrow X\), i.e. the pullback of p along f. The direct image \(f{}_\star [Q]\) of \(q:Q\overset{}{\rightarrowtail } Y\) under \(f:Y\rightarrow X\) is the image of the composite \(f\cdot p:Q\rightarrow X\). This yields an adjunction between \({\textbf {Pred}}_{X}({\mathcal {C}})\) and \({\textbf {Pred}}_{Y}({\mathcal {C}})\), i.e. \(Q\le f{}^\star [P]\) iff \(f{}_\star [Q]\le P\).

A predicate lifting of an endofunctor \({\varSigma :{\mathcal {C}}\rightarrow {\mathcal {C}}}\) is an endofunctor \(\overline{\varSigma }:{\textbf {Pred}}_{}({\mathcal {C}})\rightarrow {\textbf {Pred}}_{}({\mathcal {C}})\) making the left-hand diagram below commute; similarly, a predicate lifting of a mixed-variance bifunctor \({B:{\mathcal {C}}^\textsf{op}\times {\mathcal {C}}\rightarrow {\mathcal {C}}}\) is a bifunctor \(\overline{B} :{\textbf {Pred}}_{}({\mathcal {C}})^\textsf{op}\times {\textbf {Pred}}_{}({\mathcal {C}}) \rightarrow {\textbf {Pred}}_{}({\mathcal {C}})\) making the right-hand diagram below commute. Here \(|-|\) is the forgetful functor sending \(p:P\overset{}{\rightarrowtail } X\) to X.

(14)

We denote by \(\overline{\varSigma }\) both the action on predicates and on the corresponding objects in \({\mathcal {C}}\), i.e. \(\overline{\varSigma }(p:P\overset{}{\rightarrowtail } X):\overline{\varSigma } P\overset{}{\rightarrowtail } \varSigma X\).

Every endofunctor \(\varSigma \) on \({\mathcal {C}}\) admits a canonical predicate lifting \(\overline{\varSigma }\) mapping \(p :P \overset{}{\rightarrowtail } X\) to the image \(\overline{\varSigma }p :{\overline{\varSigma }P \overset{}{\rightarrowtail } \varSigma X}\) of \(\varSigma p:\varSigma P\rightarrow \varSigma X\) [36]. Note that \(\overline{\varSigma }p=\varSigma p\) if \(\varSigma \) preserves monos. In the remainder we will only consider canonical liftings of endofunctors.

Proposition 7

If \(\varSigma \) preserves strong epis, then \( \overline{\varSigma }^\star = \overline{\varSigma ^{\star }}. \)

The canonical predicate liftings for mixed-variance bifunctors are slightly more complex. Similarly to the case of relation liftings of such functors developed in recent work [62], their construction involves suitable pullbacks.

Proposition 8

Every bifunctor \(B:{\mathcal {C}}^\textsf{op}\times {\mathcal {C}}\rightarrow {\mathcal {C}}\) admits a canonical predicate lifting \(\overline{B}:{\textbf {Pred}}_{}({\mathcal {C}})^\textsf{op}\times {\textbf {Pred}}_{}({\mathcal {C}})\rightarrow {\textbf {Pred}}_{}({\mathcal {C}})\) sending \((p:P\rightarrowtail X,\, q:Q\rightarrowtail Y)\) to the predicate \(m_{P,Q}:\overline{B}(P,Q)\rightarrowtail B(X,Y)\), the image of the morphism \(r_{P,Q}\) given by the pullback below:

(15)

If B preserves monos in the covariant argument, then \(B({\textsf{id}},q)\) is monic and, since monos are pullback-stable, \(\overline{B}(P,Q)\) is simply the predicate \(r_{P,Q}:T_{P,Q}\rightarrowtail B(X,Y)\).

Example 9

The bifunctors B and D of (3) and (4) have canonical predicate liftings

$$\begin{aligned} \overline{B}_\tau (P,Q) & = Q_\tau + \overline{D}_\tau (P,Q)\quad \text {where} \end{aligned}$$
(16)
$$\begin{aligned} \overline{D}_{\textsf{unit}}(P,Q) &=1, \quad \overline{D}_{\tau _{1} \rightarrowtriangle \tau _{2}}(P,Q) = \{f:{X_{\tau _{1}}}\rightarrow {Y_{\tau _{2}}}\mid \forall x \in P_{\tau _{1}}.\, f(x)\in Q_{\tau _{2}} \}\subseteq Y_{\tau _{2}}^{X_{\tau _{1}}}. \end{aligned}$$
(17)

Predicate liftings allow us to generalize coalgebraic invariants [36, §6.2], viz. predicates on the state space of a coalgebra that are closed under the coalgebra structure in a suitable sense, from endofunctors to mixed-variance bifunctors:

Notation 10

For the remainder of the paper, we fix a mixed-variance bifunctor \(B:{\mathcal {C}}^\textsf{op}\times {\mathcal {C}}\rightarrow {\mathcal {C}}\) and a predicate lifting \(\overline{B}:{\textbf {Pred}}_{}({\mathcal {C}})^\textsf{op}\times {\textbf {Pred}}_{}({\mathcal {C}})\rightarrow {\textbf {Pred}}_{}({\mathcal {C}})\).

Definition 11

(Coalgebraic invariant). Let \(c :Y \rightarrow B(X,Y)\) be a \(B(X,-)\)-coalgebra. Given predicates \(S \overset{}{\rightarrowtail } X\), \(P \overset{}{\rightarrowtail } Y\), we say that P is an S-relative (\(\overline{B}\)-)invariant (for c) if \(P \le c{}^\star [\overline{B}(S,P)]\), equivalently, \(c{}_\star [P] \le \overline{B}(S,P)\). (Mention of \(\overline{B}\) is usually omitted.)

Coalgebraic invariants will feature centrally in our notion of logical predicate.

3.2 Logical Predicates via Lifted Bifunctors

As a reasoning device, the method of logical predicates (which are unary logical relations) typically applies to the following scenario: One has an operational semantics on an inductively defined set \({\mu \varSigma }\) of \(\varSigma \)-terms and a target predicate \(P \overset{}{\rightarrowtail } {\mu \varSigma }\) to be proved, in the sense that one wants to show \(P = {\mu \varSigma }\). Logical predicates come into play when a direct proof of \(P = {\mu \varSigma }\) by structural induction is not possible. The classical example of such a predicate is strong normalization [23, 59]. The idea is to strengthen P, obtaining a predicate featuring a certain “logical” structure that does allow for a proof by induction. We now develop this scenario in our abstract bifunctorial setting.

Definition 12

(Coalgebraic logical predicate). Suppose that \(c :X \rightarrow B(X,X)\) is a \(B(X,-)\) coalgebra with state space X. A predicate \(P \overset{}{\rightarrowtail } X\) is logical (for c) if it is a P-relative \(\overline{B}\)-invariant (as per Def. 11), i.e. \(P \le c{}^\star [\overline{B}(P,P)]\), equivalently, \(c{}_\star [P] \le \overline{B}(P,P)\).

In applications, c is the operational model \(\gamma :{\mu \varSigma }\rightarrow B({\mu \varSigma },{\mu \varSigma })\) of a higher-order language, or some coalgebra derived from it.

The self-referential nature of logical predicates (as relative to themselves) is meant to cater for the property that “inputs in P are mapped to outputs in P”. The following example from xTCL illustrates this:

Example 13

For B given by (3) and its canonical lifting \(\overline{B}\), a predicate \(P \overset{}{\rightarrowtail } \textsf{Tr}\) is logical for the operational model \(\gamma :\textsf{Tr}\rightarrow B(\textsf{Tr},\textsf{Tr})\) from (5) if \(\gamma {}_\star [P] \le {\overline{B}(P,P)}\), that is,

$$\begin{aligned} \begin{aligned} (\gamma _{\textsf{unit}}){}_\star [P_{\textsf{unit}}] \le &\; P_{\textsf{unit}} + 1, \\ \forall \tau _{1},\tau _{2}.\, (\gamma _{\tau _{1} \rightarrowtriangle \tau _{2}}){}_\star [P_{\tau _{1} \rightarrowtriangle \tau _{2}}] \le &\; P_{\tau _{1} \rightarrowtriangle \tau _{2}} + \{f:{\textsf{Tr}_{\tau _{1}}}\rightarrow {\textsf{Tr}_{\tau _{2}}}\mid \forall s \in P_{\tau _{1}}.\, f (s)\in P_{\tau _{2}} \}, \end{aligned} \end{aligned}$$

using the description of \(\overline{B}\) from Example 9. More explicitly, this means that

  • if \(s\in P_{\tau }\) and \(s\rightarrow t\) then \(t\in P_{\tau }\);

  • if \(s\in P_{\tau _{1} \rightarrowtriangle \tau _{2}}\) and , then \(t\in P_{\tau _1}\) implies \(u\in P_{\tau _2}\).

As we can see in the second clause, function terms that satisfy P produce outputs that satisfy P on all inputs that satisfy P. This is the key property of any logical predicate.

Defining a suitable logical predicate (or relation) is the centerpiece of various sophisticated arguments in higher-order settings. One standard application of logical predicates are proofs of strong normalization, which we now illustrate in the case of \({\textbf {xTCL}} \). For the operational model \(\gamma :\textsf{Tr}\rightarrow B(\textsf{Tr},\textsf{Tr})\) and terms rst of compatible type, put

  • \(s\mathrel {\Rightarrow }t\) if \(s=s_0\rightarrow s_1\rightarrow \cdots \rightarrow s_n=t\) for some \(n\ge 0\) and terms \(s_0,\ldots ,s_n\);

  • if \(s\mathrel {\Rightarrow }s'\) and for some (unique) \(s'\);

  • \({\Downarrow }(s)\) if \(s\mathrel {\Rightarrow }s'\) and \(\gamma (s')\in D(\textsf{Tr},\textsf{Tr})\) for some (unique) \(s'\).

Coalgebraically, this associates a weak operational model \(\widetilde{\gamma } :\textsf{Tr}\rightarrow \mathcal {P}B(\textsf{Tr},\textsf{Tr})\) to \(\gamma \), where \(\widetilde{\gamma }(t)=\{t'\mid t\mathrel {\Rightarrow }t'\}\cup \{\gamma (t')\mid t\mathrel {\Rightarrow }t',\gamma (t')\in D(\textsf{Tr},\textsf{Tr})\}\).

Strong normalization of \({\textbf {xTCL}} \) asserts that \({\Downarrow }=\textsf{Tr}\): every term eventually reduces to a function or explicitly terminates. We now devise three different logical predicates on \(\textsf{Tr}\), each of which provides a proof of that property. The idea is to refine the target predicate \({\Downarrow }\rightarrowtail \textsf{Tr}\) to a logical predicate, for which showing that it is totally true will be facilitated by its invariance w.r.t. a corresponding coalgebra structure. Our first example will be based on the following notion of refinement:

Definition 14

(Locally maximal logical refinement). Let \(c:X\rightarrow B(X,X)\) be a coalgebra and let \(P\rightarrowtail X\) be a predicate. A predicate \(\square {P}\rightarrowtail X\) is a locally maximal logical refinement of P if (i) \(\square {P} \le P\), (ii) \(\square {P}\) is logical (i.e. a \(\square {P}\)-relative \(\overline{B}\)-invariant), and (iii) for every predicate \(Q\le P\) that is a \(\square {P}\)-relative \(\overline{B}\)-invariant, one has \(Q\le \square {P}\).

Example 15

We define the predicate \(\square {\Downarrow }\rightarrowtail \textsf{Tr}\), i.e. a family of subsets \(\square {\Downarrow }_{\tau }\subseteq \textsf{Tr}_\tau \) (\(\tau \in \textsf{Ty}\)), by induction on the structure of the type \(\tau \): we put \(\square {\Downarrow }_{\textsf{unit}}={\Downarrow }_\textsf{unit}\), and we take \(\square {\Downarrow }_{\tau _{1} \rightarrowtriangle \tau _{2}}\) to be the greatest subset of \(\textsf{Tr}_{\tau _{1} \rightarrowtriangle \tau _{2}}\) satisfying

figure n

From this definition it is not difficult to verify by induction on the type that

$$\begin{aligned} \square {\Downarrow }\text { is a locally maximal logical refinement of }\Downarrow \text {.} \end{aligned}$$
(18)

Our goal is to show that \(\square {\Downarrow }\) is a subalgebra of \({\mu \varSigma }\), equivalently \(\overline{\varSigma } (\square {\Downarrow }) \le \iota {}^\star [\square {\Downarrow }]\), which then implies \(\square {\Downarrow }= \textsf{Tr}\) and hence \({\Downarrow }=\textsf{Tr}\) by structural induction. Taking the partition \(\varSigma =\varXi +\varDelta \) where \(\varXi \) is the part of the signature for application and \(\varDelta \) is the part of the signature for the remaining term constructors, we separately prove \(\overline{\varXi } (\square {\Downarrow }) \le \iota {}^\star [\square {\Downarrow }]\) and \(\overline{\varDelta } (\square {\Downarrow }) \le \iota {}^\star [\square {\Downarrow }]\). It suffices to come up with \(\square {\Downarrow }\)-relative invariants \(A,C\subseteq {\Downarrow }\) such that \(\overline{\varXi }(\square {\Downarrow })\le \iota {}^\star [A]\) and \(\overline{\varDelta }(\square {\Downarrow }) \le \iota {}^\star [C]\). Then by (18) we can conclude \(A,C\subseteq \square {\Downarrow }\), so

$$ \overline{\varXi } (\square {\Downarrow }) \le \iota {}^\star [A]\le \iota {}^\star [\square {\Downarrow }]\qquad \text {and}\qquad \overline{\varDelta } (\square {\Downarrow }) \le \iota {}^\star [C]\le \iota {}^\star [\square {\Downarrow }]. $$

Let us record for further reference what it means for \(Q\overset{}{\rightarrowtail } \textsf{Tr}\) to be a \(\square {\Downarrow }\)-relative invariant contained in \(\Downarrow \). Given \(t\in Q_\tau \), the following must hold:

figure o

We first put \(A = \square {\Downarrow }\vee (\iota \cdot {\mathop {\textsf{inl}}}){}_\star [\overline{\varXi }\square {\Downarrow }]\), and prove (1)–(3) for \(Q=A\). So let \(t\in A_\tau \); we distinguish cases on the disjunction defining A. If \(\square {\Downarrow }_\tau \, t\), then (1)–(3) follow easily by definition. Otherwise, we have \(t=p\, q\) such that \(\square {\Downarrow }_{\tau _{1} \rightarrowtriangle \tau _{2}}p\) and \(\square {\Downarrow }_{\tau _1}q\).

  1. (1)

    By definition, \(\square {\Downarrow }_{\tau _{1} \rightarrowtriangle \tau _{2}}p\) and \(\square {\Downarrow }_{\tau _1}q\) entail that for a (unique) term \(p'\), and that \(\square {\Downarrow }_{\tau _2}p'\), hence \(\Downarrow _{\tau _2} p'\). Since \(p \,q\mathrel {\Rightarrow }p'\), it follows that \(\Downarrow _{\tau _2}p\, q\).

  2. (2)

    We distinguish cases over the semantic rules for application:

    1. (a)

      \(p \,q \rightarrow p' \,q\) where \(p \rightarrow p'\). Then \(\square {\Downarrow }_{\tau _{1} \rightarrowtriangle \tau _{2}}p'\), hence \(A_{\tau _{2}}(p' \,q)\).

    2. (b)

      \(p \,q \rightarrow p'\) where . Since \(\square {\Downarrow }_{\tau _{1} \rightarrowtriangle \tau _{2}} p\) and \(\square {\Downarrow }_{\tau _1}q\), we have \(\square {\Downarrow }_{\tau _2}p'\), so \(A_{\tau _2}(p')\).

  3. (3)

    t does not have labelled transitions, hence this case is void.

Next, we show that \(C = \square {\Downarrow }\vee (\iota \cdot {\mathop {\textsf{inr}}}){}_\star [\overline{\varDelta }(\square {\Downarrow })]\) is a \(\square {\Downarrow }\)-relative invariant. We consider two representative cases; the remaining cases are handled similarly.

  • Case \(I_{\tau } :\tau \rightarrowtriangle \tau \). Since I terminates immediately, property (1) holds by definition of \(\Downarrow \) and (2) holds vacuously. For (3), if and \(\square {\Downarrow }_\tau s\), then \(t'=s\in \square {\Downarrow }_\tau \subseteq C_\tau \).

  • Case \(S''_{\tau _{1},\tau _{2},\tau _{3}}(t,s) :\tau _{1} \rightarrowtriangle \tau _{3}\) with \(\square {\Downarrow }_{\tau _{1} \rightarrowtriangle \tau _{2} \rightarrowtriangle \tau _{3}} t\) and \(\square {\Downarrow }_{\tau _{1} \rightarrowtriangle \tau _{2}} s\). Again, (1) holds because \(S''(t,s)\) terminates immediately, and (2) holds vacuously. For (3), suppose that \(\square {\Downarrow }_{\tau _1}r\); we have to show \((t\,r)\,(s\,r)\in C_{\tau _{3}}\). This follows from the inequality \(\overline{\varXi }(\square {\Downarrow }) \le \iota {}_\star [\square {\Downarrow }]\) shown above, because \(\square {\Downarrow }_{\tau _{2} \rightarrowtriangle \tau _{3}}(t \,r)\), \(\square {\Downarrow }_{\tau _2}(s \,r)\) by definition of \(\square {\Downarrow }\).

Note that the definition of \(\square {\Downarrow }\) uses both induction (over the structure of types) and coinduction (by taking at every type the greatest predicate satisfying some property).

Example 16

We give an alternative logical predicate defined purely inductively. It resembles Plotkin’s original concept of logical relation [55]. We define by

(19)

It is evidently logical for the restriction of the weak operational model to labelled transitions, given by if \(t\mathrel {\Rightarrow }t'\) and \(\gamma (t')\in D(\textsf{Tr},\textsf{Tr})\), and otherwise. A proof of strong normalization using is given in [25, App. A].

Example 17

A more popular (cf. [57, 58]) and subtly different variant of for proving strong normalization goes back to Tait [59]. We define \(\textrm{SN}\rightarrowtail \textsf{Tr}\) by

$$\begin{aligned} \begin{aligned} \textrm{SN}_\textsf{unit}\, (t) \iff &\; {\Downarrow }_\textsf{unit}\, (t) \\ \textrm{SN}_{\tau _{1} \rightarrowtriangle \tau _{2}}\,(t)\iff &\; {\Downarrow }_{\tau _{1} \rightarrowtriangle \tau _{2}}\,(t) \wedge (\forall s :\tau _{1}.\, \textrm{SN}_{\tau _{1}}(s) \implies \textrm{SN}_{\tau _{2}}(t \,s)) \end{aligned} \end{aligned}$$
(20)

Unlike , it is not immediate that \(\textrm{SN}\) is logical for (see [25, App. A]). For a proof of strong normalization based on \(\textrm{SN}\) in the context of the \(\lambda \)-calculus, see [57, Sec. 2].

While all three logical predicates \(\square {\Downarrow }\), , \(\textrm{SN}\) are eligible for proving strong normalization, with proofs of similar length and complexity, the predicate \(\square {\Downarrow }\) arguably has the most generic flavour, as it depends neither on a system-specific notion of weak transition (which appears in the definition of ) nor on the syntax of the language (such as the application operator appearing in the definition of \(\textrm{SN}\)). Thus, our abstract categorical approach to logical predicates will be based on a generalization of \(\square {\Downarrow }\).

3.3 Constructing Logical Predicates

Our abstract coalgebraic notion of logical predicate (Definition 12) is parametric in the bifunctor B and its lifting \(\overline{B}\) and decoupled from any specific syntax. Next, we develop a systematic construction that promotes a predicate P to a logical predicate, specifically to a locally maximal refinement of P, generalizing \(\square {\Downarrow }\) in Example 15. The construction proceeds in two stages. First, we fix the contravariant argument of the lifted bifunctor \(\overline{B}\) and construct a greatest coalgebraic invariant w.r.t. the resulting endofunctor [36, §6.3]:

Definition 18

(Relative henceforth). Let \(c :Y \rightarrow B(X,Y)\) and let \(S \overset{}{\rightarrowtail } X\) be a predicate. The (S-)relative henceforth modality sends \(P \overset{}{\rightarrowtail } Y\) to \(\square ^{\overline{B},c} (S, P) \overset{}{\rightarrowtail } Y\), which is the supremum in \({\textbf {Pred}}_{Y}({\mathcal {C}})\) of all S-relative invariants contained in P:

$$\begin{aligned} \square ^{\overline{B},c} (S, P) = \bigvee \{ Q \le P \mid Q \text { is an }S\text {-relative }\overline{B}\text {-invariant for }c \}. \end{aligned}$$
(21)

We will omit the superscripts \(\overline{B},c\) when they are irrelevant or clear from the context.

Proposition 19

The predicate \(\square (S, P)\) is the greatest S-relative \(\overline{B}\)-invariant contained in P. Moreover, the map \((S,P)\mapsto \square (S, P)\) is antitone in S and monotone in P.

Proof

The first statement follows from the Knaster-Tarski theorem since \(\square (S, P)\) is the greatest fixed point \( \square (S, P) = \nu G.\ P \wedge c{}^\star [\overline{B}(S,G)] \) in the complete lattice \({\textbf {Pred}}_{Y}({\mathcal {C}})\). The second statement holds due to the mixed variance of the predicate lifting \(\overline{B}\).    \(\square \)

The relative henceforth modality only yields relative invariants. To obtain a logical predicate, i.e. an invariant relative to itself, we move to the second stage of our construction, which is based on ultrametric semantics, see e.g. [9]. Let us briefly recall some terminology. A metric space \((X,\, d:X\times X\rightarrow \mathbb {R})\) is 1-bounded if \(d(x,y)\le 1\) for all xy, an ultrametric space if \(d(x,y)\le \max \{d(x,z),d(z,y)\}\) for all xyz, and complete if every Cauchy sequence converges. A map \(f:(X,d)\rightarrow (X',d')\) between metric spaces is nonexpansive if \(d'(f(x),f(y))\le d(x,y)\) for all xy, and contractive if there exists \(c\in [0,1)\), called a contraction factor, such that \(d'(f(x),f(y))\le c\cdot d(x,y)\) for all xy. A family of maps \((f_i:X\rightarrow X')_{i\in I}\) is uniformly contractive if there exists \(c\in [0,1)\) such that each \(f_i\) is contractive with factor c. By Banach’s fixed point theorem, every contractive endomap \(f:X\rightarrow X\) on a non-empty complete metric space has a unique fixed point.

Definition 20

The category \({\mathcal {C}}\) is predicate-contractive if

  1. (1)

    every \({\textbf {Pred}}_{X}({\mathcal {C}})\) carries the structure of a complete 1-bounded ultrametric space;

  2. (2)

    for every \(f:X\rightarrow Y\) in \({\mathcal {C}}\), the map \(f{}^\star [-]:{\textbf {Pred}}_{Y}({\mathcal {C}})\rightarrow {\textbf {Pred}}_{X}({\mathcal {C}})\) is non-expansive;

  3. (3)

    for any two co-well-ordered families \((P^i\overset{}{\rightarrowtail } X)_{i\in I}\) and \((Q^i\overset{}{\rightarrowtail } X)_{i\in I}\) of predicates,

    $$\begin{aligned}\textstyle d\bigl (\bigwedge _{i\in I} P^i,\bigwedge _{i\in I} Q^i\bigr )\le \sup _{i\in I} d(P^i,Q^i). \end{aligned}$$

    Here \((P^i\overset{}{\rightarrowtail } X)_{i\in I}\) is co-well-ordered if each nonempty subfamily has a greatest element.

Example 21

The category \({\mathcal {C}}={\textbf {Set}}^\textsf{Ty}\) is predicate-contractive when equipped with the ultrametric on \({\textbf {Pred}}_{X}({\mathcal {C}})\) given by \(d(P,Q)=2^{-n}\) for \(P,Q\rightarrowtail X\), where \(n=\inf \{\sharp \tau \mid P_\tau \ne Q_\tau \}\) and \(\sharp \tau \) is the size of \(\tau \), defined by \(\sharp \textsf{unit}=1\) and \(\sharp (\tau _1 \rightarrowtriangle \tau _2)=\sharp \tau _1+\sharp \tau _2\). By convention, \(\inf \,\emptyset =\infty \) and \(2^{-\infty }=0\). To see predicate-contractivity, first note that a function \(\mathcal {F}:{\textbf {Pred}}_{Y}({\mathcal {C}})\rightarrow {\textbf {Pred}}_{X}({\mathcal {C}})\) is non-expansive iff

$$\begin{aligned} \inf \{\sharp \tau \mid (\mathcal {F} P)_\tau \ne (\mathcal {F} Q)_\tau \} \ge \inf \{\sharp \tau \mid P_\tau \ne Q_\tau \} \qquad \text {for all }P, Q\rightarrowtail Y, \end{aligned}$$

and contractive (necessarily with factor at most 1/2) iff that inequality holds strictly.

This immediately implies clause (2) of Definition 20: inverse images in \({\textbf {Set}}^\textsf{Ty}\) are computed pointwise, and \(f_\tau {}^\star [P_\tau ] \ne f_\tau {}^\star [Q_\tau ]\) implies \(P_\tau \ne Q_\tau \) for \(f:X\rightarrow Y\) and \(P,Q\rightarrowtail Y\). Similarly, since intersections are computed pointwise, clause (3) amounts to

$$\begin{aligned} \inf \Bigl \{\sharp \tau \mid \bigcap _{i\in I}P^i_\tau \ne \bigcap _{i\in I}Q^i_\tau \Bigr \} \ge &\;\inf \{\sharp \tau \mid \exists i\in I: P^i_\tau \ne Q_\tau ^i\}, \end{aligned}$$

which is clearly true, for if \(\bigcap _{i\in I}P^i_\tau \ne \bigcap _{i\in I}Q^i_\tau \) then \(P^i_\tau \ne Q^i_\tau \) for some \(i\in I\).

Definition 22

(Contractive lifting). Suppose that \({\mathcal {C}}\) is predicate-contractive. The predicate lifting \(\overline{B}:{\textbf {Pred}}_{}({\mathcal {C}})^\textsf{op}\times {\textbf {Pred}}_{}({\mathcal {C}})\rightarrow {\textbf {Pred}}_{}({\mathcal {C}})\) is contractive if for every \(S\overset{}{\rightarrowtail } X\) the map \(\overline{B}(S,-)\) is non-expansive, and the family \((\overline{B}(-,P))_{P\overset{}{\rightarrowtail } X}\) is uniformly contractive.

Proposition 23

Let \(\overline{B}\) be contractive and \(c:X\rightarrow B(X,X)\). For every \(S\overset{}{\rightarrowtail } X\), the map \(\square ^{\overline{B},c}(S,-)\) is non-expansive, and the family \((\square ^{\overline{B},c}(-,P))_{P\overset{}{\rightarrowtail } X}\) is uniformly contractive.

Contractive liftings allow us to augment every predicate P to a logical predicate:

Definition 24

(Henceforth). Let \(\overline{B}\) be contractive and \(c:X\rightarrow B(X,X)\). For each predicate \(P\rightarrowtail X\) we define \(\square ^{\overline{B},c} P \overset{}{\rightarrowtail } X\) (where we usually omit the superscripts) to be the unique fixed point of the contractive endomap

$$\begin{aligned} S\mapsto \square ^{\overline{B},c} (S, P) \quad \text {on}\quad {\textbf {Pred}}_{X}({\mathcal {C}}). \end{aligned}$$
(22)

Theorem 25

The predicate \(\square {P}\) is the unique locally maximal logical refinement of P.

Proof

By (22), \(\square {P}\) is the unique predicate satisfying \(\square {P}=\square (\square {P},P)\). By (21), this equality says that \(\square {P}\) is the greatest \(\square {P}\)-relative invariant contained in P, as needed.    \(\square \)

Example 26

Let B be the behaviour bifunctor on \({\textbf {Set}}^\textsf{Ty}\) given by (3). Its canonical lifting \(\overline{B}\) (Example 9) is contractive because \(\overline{B}_{\tau _1 \rightarrowtriangle \tau _2}(P,Q)\) depends only on \(P_{\tau _1}\), \(Q_{\tau _2}\), \(Q_{\tau _1 \rightarrowtriangle \tau _2}\); in other words, \(\overline{B}\) decreases the size of types in the contravariant argument and does not increase it in the covariant argument. Given a coalgebra \(c :X \rightarrow B(X,X)\) and \({P \overset{}{\rightarrowtail } X}\), the fixed point \(\square ^{\overline{B},c} P\) is given by the \(\textsf{Ty}\)-indexed family of greatest fixed points

$$\begin{aligned} \begin{aligned} \square P_{\textsf{unit}} = &\; \nu G.\, P_{\textsf{unit}} \wedge c_{\textsf{unit}}{}^\star [G + 1], \\ \square P_{\tau _{1} \rightarrowtriangle \tau _{2}} = &\; \nu G.\, P_{\tau _{1} \rightarrowtriangle \tau _{2}} \wedge c_{\tau _{1} \rightarrowtriangle \tau _{2}}{}^\star [G + \{f:{\textsf{Tr}_{\tau _{1}}}\rightarrow {\textsf{Tr}_{\tau _{2}}}\mid \forall s \in \square {P}_{\tau _{1}}.\, f (s)\in \square {P}_{\tau _{2}} \}]. \end{aligned} \end{aligned}$$

This follows from Theorem 25 since the above predicate is clearly a locally maximal refinement of P. By instantiating c to the operational model \(\gamma :{\mu \varSigma }\rightarrow B({\mu \varSigma },{\mu \varSigma })\) of \({\textbf {xTCL}} \) and taking \(P={\Downarrow }\), we recover the definition of \(\square {\Downarrow }\) in Example 15.

Example 27

The logical predicate of Example 16 is precisely \(\square {\Downarrow }\) for \(\mathcal {P}D\) w.r.t. its canonical lifting and the coalgebra . More generally, for a coalgebra \(c :X \rightarrow \mathcal {P}D(X,X)\), the predicate \(\square P\) is inductively defined as follows:

$$\begin{aligned} \begin{aligned} \square P_{\textsf{unit}} =&\; P_{\textsf{unit}}, \\ \square P_{\tau _{1} \rightarrowtriangle \tau _{2}} =&\; P_{\tau _{1} \rightarrowtriangle \tau _{2}} \wedge c_{\tau _{1} \rightarrowtriangle \tau _{2}}{}^\star [\{F\subseteq X_{\tau _{2}}^{X_{\tau _{1}}}\mid \forall f\in F.\, s\in \square P_{\tau _{1}}\implies f (s)\in \square P_{\tau _{2}} \}]. \end{aligned} \end{aligned}$$

Remark 28

The construction of logical predicates for typed languages is enabled by the “type-decreasing” nature of the associated behaviour bifunctors. In untyped settings, e.g. for \(B(X,Y) = Y+Y^{X}\) on \({\textbf {Set}}\) modelling untyped combinatory logic [24], the canonical lifting \(\overline{B}\) is not contractive, hence the fixed point \(\square {P}\) in general fails to exist.

Remark 29

The forgetful functor \(|-|:{\textbf {Pred}}_{}({\mathcal {C}})\rightarrow {\mathcal {C}}\) forms a complete lattice fibration [35], equivalently a topological functor [2], and all notions and results of the present subsection extend to that level of generality. We leave the details for future work, as our reasoning techniques found in the upcoming sections are tailored to logical predicates.

We are now in a position to state precisely what a proof via logical predicates is in our framework. Given the operational model \(\gamma :{\mu \varSigma }\rightarrow B({\mu \varSigma },{\mu \varSigma })\) of a higher-order language, a predicate lifting \(\overline{B}\), and a target predicate \(P \overset{}{\rightarrowtail } {\mu \varSigma }\), a proof of P via logical predicates is a proof that \(\square P\) forms a subalgebra of the initial algebra \({\mu \varSigma }\), which means

$$\begin{aligned} \overline{\varSigma } (\square P) \le \iota {}^\star [\square P], \quad \text { equivalently}\quad \iota {}_\star [\overline{\varSigma } (\square P)] \le \square P. \end{aligned}$$
(23)

Then \(\square {P}={\mu \varSigma }\) by structural induction, whence \(P={\mu \varSigma }\) because \(\square {P}\le P\).

Up to this point, we have streamlined and formalized coalgebraic logical predicates as a certain abstract construction on predicates (Definition 24) and presented proofs by coalgebraic logical predicates as standard structural induction on said construction. This presentation is indeed that of an abstract method: the various parts of the problem setting, namely the syntax, the behaviour and its predicate lifting, as well as the operational semantics, are all parameters. In the next section, we exploit the parametric and generic nature of this method in two main ways. First, we present up-to techniques that simplify the proof goal (23) as much as possible. Second, we look to instantiate our method to problems on classes of higher-order languages, as opposed to reasoning about operational models of individual languages such as xTCL or the \(\lambda \)-calculus.

4 Logical Predicates and Higher-Order Abstract GSOS

As indicated before, substantial parts of the proof of strong normalization in Example 15 look generic. Specifically, the properties (2) and (3) established for \(Q=A\) and \(Q=C\) are independent of the choice of predicate \(P = {\Downarrow }\) in \(\square P\). Moreover, these steps are either obvious or follow immediately from the operational rules of \({\textbf {xTCL}} \): the predicates A and C being invariants can be attributed to the fact that except for terms of the form \(S''(-,-)\), all terms evolve either to a variable or to some flat term such as \(p' \,q\). The core of the proof, which is tailored to the choice of P, lies in proving property (1).

As it turns out, for a class of higher-order GSOS laws that we call relatively flat higher-order GSOS laws, conditions (2) and (3) are automatic. This insight leads us to a powerful up-to technique that simplifies proofs via logical predicates.

4.1 Relatively Flat Higher-Order GSOS Laws

The following definition abstracts the restricted nature of the rules of \({\textbf {xTCL}} \) to the level of higher-order GSOS laws. For simplicity, we confine ourselves to 0-pointed laws, however all the results of this subsection easily extend to the V-pointed case.

Definition 30

Let \(\varSigma :{\mathcal {C}}\rightarrow {\mathcal {C}}\) be a syntax functor of the form \(\varSigma = \coprod _{j \in J} \varSigma _{j}\), where \((J,\prec )\) is a non-empty well-founded strict partial order, and put \(\varSigma _{\prec k} = \coprod _{j \prec k}\varSigma _{j}\). A relatively flat (0-pointed) higher-order GSOS law of \(\varSigma \) over B is a J-indexed family of morphisms

$$\begin{aligned} \varrho ^{j}_{X,Y} :\varSigma _{j} (X \times B(X,Y))\rightarrow B(X, \varSigma ^{\star }_{\prec j} (X+Y) + \varSigma _{j}\varSigma ^{\star }_{\prec j} (X+Y)) \end{aligned}$$
(24)

dinatural in \(X\in {\mathcal {C}}\) and natural in \(Y\in {\mathcal {C}}\).

We put \(e_{j,X} = [{\mathop {\textsf{in}}}^{\sharp }_{\prec j}, \iota \cdot {\mathop {\textsf{in}}}_j \cdot \varSigma _{j}({\mathop {\textsf{in}}}^{\sharp }_{\prec j})]:\varSigma ^{\star }_{\prec j} X + \varSigma _{j}\varSigma ^{\star }_{\prec j} X\rightarrow \varSigma ^{\star } X\) where \({\mathop {\textsf{in}}}_{\prec j}:\varSigma _{\prec j} \rightarrow \varSigma \) and \({\mathop {\textsf{in}}}_{j}:\varSigma _{j} \rightarrow \varSigma \) are the coproduct injections, with free extensions \({\mathop {\textsf{in}}}^{\sharp }_{\prec j} :\varSigma ^{\star }_{\prec j} \rightarrow \varSigma ^{\star }\) and \({\mathop {\textsf{in}}}_{j}^\sharp :\varSigma _{j}^\star \rightarrow \varSigma ^\star \). Every relatively flat higher-order GSOS law (24) determines an ordinary higher-order GSOS law of \(\varSigma \) over B with components

figure ae

When we interpret a higher-order GSOS law as a set of operational rules, relative flatness means that the operations of the language can be ranked in a way that every term \({\mathop {\textsf{f}}}(-,\cdots ,-)\) with \({\mathop {\textsf{f}}}\) of rank j evolves into a term that uses only operations of strictly lower rank, except possibly its head symbol which may have the same rank j.

Example 31

xTCL is relatively flat: put \(J = \{0\prec 1\}\), let \(\varSigma _0\) contain application, and let \(\varSigma _1\) contain all other operation symbols. This is immediate from the rules in Figure 1.

Definition 32

Suppose that each \(\varSigma _j\) preserves strong epimorphisms. A predicate lifting of (24) is a relatively flat 0-pointed higher-order GSOS law \((\overline{\varrho }^j)_{j\in J}\) of \(\overline{\varSigma }=\coprod _j \overline{\varSigma _j}\) over \(\overline{B}\) where for every \(P\rightarrowtail X\) and \(Q\rightarrowtail Y\) the \({\textbf {Pred}}_{}({\mathcal {C}})\)-morphism \(\overline{\varrho }^{j}_{P,Q}\) is carried by \(\varrho ^j_{X,Y}\).

Remark 33

  1. (1)

    The condition on \(\varSigma _j\) ensures \(\overline{\varSigma _j}^\star =\overline{\varSigma _j^\star }\) (Proposition 7), so that the first component of \(\overline{\varrho }^{j}_{P,Q}\) has type \(\varSigma _{j} (X \times B(X,Y))\rightarrow B(X, \varSigma ^{\star }_{\prec j} (X+Y) + \varSigma _{j}\varSigma ^{\star }_{\prec j} (X+Y))\).

  2. (2)

    Liftings are unique if they exist: since \(\overline{\varrho }^{j}_{P,Q}\) is a \({\textbf {Pred}}_{}({\mathcal {C}})\)-morphism, it is determined by its first component \(\varrho ^j_{X,Y}\). Moreover, the (di)naturality of \(\overline{\varrho }^{j}\) follows from that of \(\varrho ^j\).

  3. (3)

    For the canonical lifting \(\overline{B}\), a lifting \((\overline{\varrho }^{j})_{j\in J}\) of \((\varrho ^j)_{j\in J}\) always exists [25, App. D].

The following theorem establishes a sound up-to technique for logical predicates. It states that for operational models of relatively flat laws, the proof goal (23) can be established by checking a substantially relaxed property.

Theorem 34

(Induction up to \(\square \)). Let \(\gamma :{\mu \varSigma }\rightarrow B({\mu \varSigma },{\mu \varSigma })\) be the operational model of a relatively flat 0-pointed higher-order GSOS law that admits a predicate lifting. Then for every predicate \(P \overset{}{\rightarrowtail } {\mu \varSigma }\) and every locally maximal logical refinement \(\square ^{\gamma ,\overline{B}}P\),

$$ \overline{\varSigma } (\square ^{\gamma ,\overline{B}} P) \le \iota {}^\star [P] \qquad \text {implies} \quad \overline{\varSigma } (\square ^{\gamma ,\overline{B}} P) \le \iota {}^\star [\square ^{\gamma ,\overline{B}} P] \quad \text {(hence }{P}={\mu \varSigma }\text {)}. $$

We stress that the theorem applies to any refinement \(\square ^{\gamma ,\overline{B}}P\) and does not assume a specific construction (e.g. that of Section 3.3). The up-to technique facilitates proofs via logical predicates quite dramatically. For illustration, we revisit strong normalization:

Example 35

We give an alternative proof of strong normalization of xTCL (cf. Example 15) via induction up to \(\square \). Hence it suffices to prove

$$ {\overline{\varSigma } (\square {\Downarrow })} \le \iota {}^\star [\Downarrow ], $$

which states that a term is terminating if all of its subterms are in the logical predicate \(\square {\Downarrow }\). This is clear for terms that are not applications, since they immediately terminate (cf. Figure 1). Now consider an application \(p \, q\) such that \( \square _{\tau _{1} \rightarrowtriangle \tau _{2}} {\Downarrow }(p)\) and \(\square _{\tau _{1}} {\Downarrow }(q)\). Since \(\square {\Downarrow }\) is a logical predicate contained in \(\Downarrow \), this entails that for a (unique) term \(p'\), and that \(\square {\Downarrow }_{\tau _2}p'\), hence \(\Downarrow _{\tau _2} p'\). Since \(p \,q\mathrel {\Rightarrow }p'\), it follows that \(\Downarrow _{\tau _2}p\, q\).

Analogous reasoning shows that xTCL is strongly normalizing under the call-by-value and the maximally nondeterministic evaluation strategy (Remark 5). In the latter case, strong normalization means that every term must eventually terminate, independently of the order of evaluation.

The reader should compare the above compact argument to the laborious original proof given in Example 15. Our up-to technique can be seen to precisely isolate the non-trivial core of the proof, while providing its generic parts for free. For a further application – type safety of the simply typed \(\lambda \)-calculus – see Section 4.2.

4.2 \(\lambda \)-Laws

We proceed to explain how our theory of logical predicates applies to languages with variables and binders. We highlight the core ideas and technical challenges in the case of the \(\lambda \)-calculus, and briefly sketch their categorical generalization; a full exposition can be found in [25, App. E]. Let STLC be the simply typed call-by-name \(\lambda \)-calculus with the set \(\textsf{Ty}\) of types given by (2) and operational rules

(25)

where \(s,t,t'\) range over \(\lambda \)-terms of appropriate type, and \([-/-]\) denotes capture-avoiding substitution. To model STLC in higher-order abstract GSOS, we follow ideas by Fiore [18]. Our base category \({\mathcal {C}}\) is the presheaf category \(({\textbf {Set}}^{{\mathbb {F}}/{\textsf{Ty}}})^{\textsf{Ty}}\) where \({\mathbb {F}}\) denotes the category of finite cardinals and functions, and the set \(\textsf{Ty}\) is regarded as a discrete category. An object \(\varGamma :n\rightarrow \textsf{Ty}\) of \({\mathbb {F}}/\textsf{Ty}\) is a typed context, associating to each variable \(x\in n\) a type; we put \(|{\varGamma }|:=n\) . A presheaf \(X\in ({\textbf {Set}}^{{\mathbb {F}}/{\textsf{Ty}}})^{\textsf{Ty}}\) associates to each context \(\varGamma \) and each type \(\tau \) a set \(X_{\tau }(\varGamma )\) whose elements we think of as terms of type \(\tau \) in context \(\varGamma \).

The syntax of \({\textbf {STLC}} \) is captured by the functor \(\varSigma :({\textbf {Set}}^{{\mathbb {F}}/{\textsf{Ty}}})^{\textsf{Ty}}\rightarrow ({\textbf {Set}}^{{\mathbb {F}}/{\textsf{Ty}}})^{\textsf{Ty}}\) where

$$\begin{aligned} \begin{aligned} \varSigma _{\textsf{unit}}X = &\; V_{\textsf{unit}} + K_{1} + \coprod _{\tau \in \textsf{Ty}}X_{\tau \rightarrowtriangle \textsf{unit}} \times X_{\tau },\\ \varSigma _{\tau _{1} \rightarrowtriangle \tau _{2}} X= &\; V_{\tau _{1} \rightarrowtriangle \tau _{2}} + \delta ^{\tau _{1}}_{\tau _{2}} X + \coprod _{\tau \in \textsf{Ty}}X_{\tau \rightarrowtriangle \tau _{1} \rightarrowtriangle \tau _{2}} \times X_{\tau }. \end{aligned} \end{aligned}$$
(26)

Here \(K_1\in {\textbf {Set}}^{{\mathbb {F}}/{\textsf{Ty}}}\) is the constant presheaf on 1, V is given by \(V_{\tau }(\varGamma )=\{x\in |\varGamma |\mid \varGamma (x)=\tau \}\), and \(\delta \) by \((\delta ^{\tau _1}_{\tau _2}X)(\varGamma ) = X_{\tau _2}(\varGamma + \check{\tau }_1)\) with \((-) + \check{\tau }_1\) denoting context extension by a variable of type \(\tau _1\). Informally, \(K_1\), V and \(\delta \) represent the constant \(\textsf{e}:\textsf{unit}\), variables, and \(\lambda \)-abstraction, respectively. The initial algebra for \(\varSigma \) is the presheaf \(\varLambda \) of \(\lambda \)-terms, i.e. \(\varLambda _\tau (\varGamma )\) is the set of \(\lambda \)-terms (modulo \(\alpha \)-equivalence) of type \(\tau \) in context \(\varGamma \) [18].

The behaviour bifunctor \(B^{\lambda } :(({\textbf {Set}}^{{\mathbb {F}}/{\textsf{Ty}}})^{\textsf{Ty}})^\textsf{op}\times ({\textbf {Set}}^{{\mathbb {F}}/{\textsf{Ty}}})^{\textsf{Ty}}\rightarrow ({\textbf {Set}}^{{\mathbb {F}}/{\textsf{Ty}}})^{\textsf{Ty}}\) for \({\textbf {STLC}} \) has two separate components: it is given by a product

(27)

and \(Y_{\tau _{2}}^{X_{\tau _{1}}}\) is an exponential object in \({\textbf {Set}}^{{\mathbb {F}}/\textsf{Ty}}\). The bifunctor models an abstract substitution structure; for instance, every \(\lambda \)-term \(t\in \varLambda _\tau (\varGamma )\) induces a natural transformation \(\prod _{x\in |\varGamma |} \varLambda _{\varGamma (x)}\rightarrow \varLambda _\tau \) in mapping a tuple \((t_1,\ldots ,t_{|\varGamma |})\) to the term obtained by simultaneous substitution of the terms \(t_i\) for the variables of t. The summands of the bifunctor B abstract from the possible operational behaviour of \(\lambda \)-terms: a term may explicitly terminate, reduce, get stuck (e.g. if it is a variable), or act as a function.

The operational rules (25) of STLC can be encoded into a V-pointed higher-order GSOS law of \(\varSigma \) over \(B^{\lambda }\), similar to the untyped \(\lambda \)-calculus treated in earlier work [24]. The operational model is the coalgebra whose components \(\phi ,\gamma \) describe the substitution structure and the operational behaviour of \(\lambda \)-terms.

At this point, a key technical issue can be observed: the canonical predicate lifting is not contractive. Indeed, given \(P\rightarrowtail X\), \(Q\rightarrowtail Y\), the predicate consists of all natural transformations \(\prod _{x \in |\varGamma |}X_{\varGamma (x)}\rightarrow Y_{\tau }\) that restrict to \(\prod _{x \in |\varGamma |}P_{\varGamma (x)}\rightarrow Q_{\tau }\), and this expression depends on \(P_{\varGamma (x)}\) where the type \(\varGamma (x)\) may be of higher complexity than \(\tau \). In particular, we conclude that \(\overline{B^{\lambda }}\) is not contractive. In contrast, the canonical lifting \(\overline{B}\) is contractive and hence \(\square ^{\gamma ,\overline{B}} P\) exists for every \(P\rightarrowtail \varLambda \) (Definition 24). However, it is well-known that logical predicates do not do the trick for inductive proofs in the \(\lambda \)-calculus, see e.g. [57, p. 9] and [49, p. 150]; rather, one needs to prove the open extension of the logical predicate, which is the larger predicate

figure al

The standard proof method is then to show directly by structural induction. However, this can be greatly simplified by the following up-to-principle, which works with the original predicate \(\square ^{\gamma ,\overline{B}}{P}\) and forms a counterpart of Theorem 34 for the \(\lambda \)-calculus:

Theorem 36

(Induction up to ). Let \(P\rightarrowtail \varLambda \) be a predicate. Then

figure ao

Remark 37

Concretely, the theorem states that to prove \(P=\varLambda \), it suffices to prove that (1) variables satisfy P, (2) the unit expression \(\textsf{e} :\textsf{unit}\) satisfies P, (3) for all application terms \(p \, q\) such that \(\square _{\tau _{1} \rightarrowtriangle \tau _{2}} P(\varGamma )(p)\) and \(\square _{\tau _{1}} P(\varGamma )(q)\), we have \(P_{\tau _{2}}(\varGamma )(p \, q)\), and (4) for all \(\lambda \)-abstractions \(\lambda x :\tau _{1}.\,t\) such that \(t \in \square _{\tau _{2}}P(\varGamma ,x)\), we have \(P_{\tau _{1} \rightarrowtriangle \tau _{2}}(\varGamma )(\lambda x :\tau _{1}.\,t)\).

Example 38

We prove type safety for STLC via induction up to . Thus consider the predicate \(\textrm{Safe}\rightarrowtail \varLambda \) that is constantly true on open terms and given by

$$ t\in \textrm{Safe}_{\tau }(\varnothing ) \iff \big (\forall {e}.\,t \mathrel {\Rightarrow }e \implies (e\text { is not an application}) \vee \exists r.\, e \rightarrow r\big ), $$

on closed terms. We only need to check the conditions (1)–(4) of Remark 37. Conditions (1), (2), (4) are clear since variables are open terms and the term \(\textsf{e}:\textsf{unit}\) and \(\lambda \)-abstractions do not reduce. The only interesting clause is (3) for the empty context. Thus let \(p \, q\) be a closed application term with \(p \in \square {\textrm{Safe}}_{\tau _{1} \rightarrowtriangle \tau _{2}}(\varnothing )\) and \(q \in \square {\textrm{Safe}}_{\tau _{1}}(\varnothing )\); we need to show \(p \, q \in {\textrm{Safe}}_{\tau _{2}}(\varnothing )\). We proceed by case distinction on \(p \, q \mathrel {\Rightarrow }e\):

  1. (a)

    \(p \mathrel {\Rightarrow }p'\) and \(e = p' \, q\). Then \(p'\in \square {\textrm{Safe}}_{\tau _{1} \rightarrowtriangle \tau _{2}}(\varnothing )\) by invariance, in particular \(p'\) is safe, so \(p'\) is either not an application or reduces. In the former case, \(p'\) is necessarily a \(\lambda \)-abstraction since it is closed and not of type \(\textsf{unit}\). Thus, in both cases, e reduces.

  2. (b)

    \(p \mathrel {\Rightarrow }\lambda x.p'\) and \(p'[q/x] \mathrel {\Rightarrow }e\). Since \(\square \textrm{Safe}\) is a logical predicate, from \(p \in \square {\textrm{Safe}}_{\tau _{1} \rightarrowtriangle \tau _{2}}(\varnothing )\) and \(q \in \square _{\tau _{1}}\textrm{Safe}(\varnothing )\) we can deduce \(p'[q/x] \in \square _{\tau _{2}}\textrm{Safe}(\varnothing )\), whence \(e \in \square _{\tau _{2}}\textrm{Safe}(\varnothing )\). In particular, e is safe, which implies that e is either not an application or reduces.

As an exercise, we invite the reader to prove strong normalization of \({\textbf {STLC}} \) via induction up to . The reader should compare these short and simple proofs with more traditional ones, see e.g. [57].

All the above results and observations for STLC can be generalized and developed at the level of general higher-order abstract GSOS laws. To this end, we first abstract the behaviour functor (27) to a functor of the form , where is the internal hom-functor of a suitable closed monoidal structure on the base category \({\mathcal {C}}\). In the case of \({\textbf {STLC}} \), this structure is given by Fiore’s substitution tensor [18]. Second, we observe that the higher-order GSOS law of \({\textbf {STLC}} \) is an instance of a special kind of law that we coin relatively flat \(\lambda \)-laws. The induction-up-to- technique of Theorem 36 then can be shown to hold for operational models of relatively flat \(\lambda \)-laws. More details can be found in [25, App. E].

5 Strong Normalization for Deterministic Systems, Abstractly

The high level of generality in which the theory of logical predicates is developed above enables reasoning uniformly about whole families of languages and behaviours. In this section, we narrow our focus to deterministic systems and establish a general strong normalization criterion, which can be checked in concrete instances by mere inspection of the operational rules corresponding to higher-order abstract GSOS laws.

Throughout this section, we fix a 0-pointed higher-order GSOS law \(\varrho \) of a signature endofunctor \(\varSigma :{\mathcal {C}}\rightarrow {\mathcal {C}}\) over a behaviour bifunctor \(B:{\mathcal {C}}^\textsf{op}\times {\mathcal {C}}\rightarrow {\mathcal {C}}\), where

$$\begin{aligned} B(X,Y)=Y+D(X,Y) \quad \text {for some}\quad D:{\mathcal {C}}^\textsf{op}\times {\mathcal {C}}\rightarrow {\mathcal {C}}. \end{aligned}$$

For instance, the type functor (3) for \({\textbf {xTCL}} \) is of that form. The operational model \(\gamma :{\mu \varSigma }\rightarrow {\mu \varSigma }+D({\mu \varSigma },{\mu \varSigma })\) has an n-step extension \(\gamma ^{(n)}:{\mu \varSigma }\rightarrow {\mu \varSigma }+D({\mu \varSigma },{\mu \varSigma })\), for each \(n\in \mathbb {N}\), where \(\gamma ^{(0)}\) is the left coproduct injection and \(\gamma ^{(n+1)}\) is the composite

$$ {\mu \varSigma }\xrightarrow {\gamma }{\mu \varSigma }+D({\mu \varSigma },{\mu \varSigma }) \xrightarrow {\gamma ^{(n)}+{\textsf{id}}} {\mu \varSigma }+D({\mu \varSigma },{\mu \varSigma })+D({\mu \varSigma },{\mu \varSigma }) \xrightarrow {{\textsf{id}}+\nabla } {\mu \varSigma }+D({\mu \varSigma },{\mu \varSigma }).$$

We regard \(D({\mu \varSigma },{\mu \varSigma })\) as a predicate on \(B({\mu \varSigma },{\mu \varSigma })\) via the right coproduct injection, which is monic by extensivity of \({\mathcal {C}}\), and define the following predicates on \({\mu \varSigma }\):

$$ \Downarrow _n \,=\, (\gamma ^{(n)})^\star [D({\mu \varSigma },{\mu \varSigma })]\qquad \text {and}\qquad \Downarrow \,=\,\bigvee _{n} \Downarrow _n. $$

In xTCL, these are the predicates of strong normalization or strong normalization after at most n steps, resp. Accordingly, we define strong normalization abstractly as follows:

Definition 39

The higher-order GSOS law \(\varrho \) is strongly normalizing if \(\Downarrow \,={\mu \varSigma }\).

We next identify two natural conditions on the law \(\varrho \) that together ensure strong normalization. The first roughly asserts that for a term \(t={\mathop {\textsf{f}}}(x_1,\ldots ,x_n)\) whose variables \(x_i\) are non-progressing, the term t is either non-progressing or it progresses to a variable.

Definition 40

The higher-order GSOS law \(\varrho \) is simple if its components \(\varrho _{X,Y}\) restrict to morphisms \(\varrho _{X,Y}^0\) as in the diagram below, where \(\eta \) is the unit of the free monad \(\varSigma ^{\star }\):

figure au

The second condition asserts that the rules represented by the higher-order GSOS law remain sound when strong transitions are replaced by weak ones. In the following, the graph of a morphism \(f:A\rightarrow B\) is the image \(\textsf{gra}(f)\rightarrowtail A\times B\) of \(\langle {\textsf{id}},f\rangle :A\rightarrow A\times B\).

Definition 41

The higher-order GSOS law \(\varrho \) respects weak transitions if for every \(n\in \mathbb {N}\), the graph of the composite below is contained in \(\bigvee _k\, \textsf{gra}(\gamma ^{(k)}\cdot \iota )\).

$$ \varSigma ({\mu \varSigma })\xrightarrow {\varSigma \langle {\textsf{id}},\gamma ^{(n)}\rangle } \varSigma ({\mu \varSigma }\times B({\mu \varSigma },{\mu \varSigma })) \xrightarrow {\varrho _{{\mu \varSigma },{\mu \varSigma }}} B({\mu \varSigma },\varSigma ^{\star }({\mu \varSigma }+{\mu \varSigma })) \xrightarrow {B({\textsf{id}}, \hat{\iota }\cdot \varSigma ^{\star }\nabla )} B({\mu \varSigma },{\mu \varSigma }) $$

Note that the higher-order GSOS law for \({\textbf {xTCL}} \) is simple and respects weak transitions. Thus, strong normalization of \({\textbf {xTCL}} \) is an instance of the following strong normalization theorem for higher-order abstract GSOS. Concerning its conditions, an \(\omega \)-directed union is a colimit of an \(\omega \)-chain \(X_0\rightarrowtail X_1\rightarrowtail X_2\rightarrowtail \cdots \) of monics. We say that monos in \({\mathcal {C}}\) are \(\omega \)-smooth if any such colimit has monic injections, and moreover for every compatible cocone of monos, the mediating morphism is monic. This property holds in every locally finitely presentable category [3, Prop. 1.62], e.g. sets, posets, or presheaves.

Theorem 42

(Strong normalization). Suppose that the following conditions hold:

  1. (1)

    On top of Assumptions 6, \({\mathcal {C}}\) is countably extensive, and monos are \(\omega \)-smooth.

  2. (2)

    \(\varSigma \) preserves \(\omega \)-directed unions, and D preserves monos in the second component.

  3. (3)

    \(\varrho \) is relatively flat, simple, and respects weak transitions.

  4. (4)

    \(\Downarrow \) has a locally maximal logical refinement w.r.t. \(\gamma \) and the canonical lifting \(\overline{B}\).

Then the higher-order GSOS law \(\varrho \) is strongly normalizing.

Recall that condition (4) holds if \(\overline{B}\) is contractive (Theorem 25). The proof uses the induction-up-to-\(\square \) technique and a careful categorical abstraction of Example 35.

6 Conclusion and Future Work

Our work presents the initial steps towards a unifying, efficient theory of logical relations for higher-order languages based on higher-order abstract GSOS. This theory can be broadened in various directions. One obvious direction would be to extend our theory from predicates to relations. Binary logical relations are often utilized as sound (and sometimes complete) relations w.r.t. contextual equivalence. Additional generalizations are suggested by the large amount of existing work on logical relations. One important direction is to generalize the type system to cover, e.g., recursive types, parametric polymorphism, or dependent types. Supporting recursive types will presumably require an adaptation of the method of step-indexing [17] to our abstract setting. Another point of interest is to apply and extend our framework to effectful (e.g. probabilistic) settings [40, 54], including e.g. an effectful version of the criterion of Section 5.

As indicated in Remark 29, large parts of our development in Section 3 can be reformulated in fibrational terms. This has the potential merit of enabling abstract reasoning about higher-order programs in metric and differential settings as done in previous work on fine-grain call-by-value [13, 14]. In future work, we aim to develop such a generalization, and to explore the connection between our weak transition semantics and the general evaluation semantics used in op. cit.