Consistent Subtyping for All
Abstract
Consistent subtyping is employed in some gradual type systems to validate type conversions. The original definition by Siek and Taha serves as a guideline for designing gradual type systems with subtyping. Polymorphic types à la System F also induce a subtyping relation that relates polymorphic types to their instantiations. However, Siek and Taha’s definition is not adequate for polymorphic subtyping. The first goal of this paper is to propose a generalization of consistent subtyping that is adequate for polymorphic subtyping, and subsumes the original definition by Siek and Taha. The new definition of consistent subtyping provides novel insights with respect to previous polymorphic gradual type systems, which did not employ consistent subtyping. The second goal of this paper is to present a gradually typed calculus for implicit (higher-rank) polymorphism that uses our new notion of consistent subtyping. We develop both declarative and (bidirectional) algorithmic versions for the type system. We prove that the new calculus satisfies all static aspects of the refined criteria for gradual typing, which are mechanically formalized using the Coq proof assistant.
1 Introduction
Gradual typing [21] is an increasingly popular topic in both programming language practice and theory. On the practical side there is a growing number of programming languages adopting gradual typing. Those languages include Clojure [6], Python [27], TypeScript [5], Hack [26], and the addition of Dynamic to C# [4], to cite a few. On the theoretical side, recent years have seen a large body of research that defines the foundations of gradual typing [8, 9, 13], explores its use for both functional and object-oriented programming [21, 22], and investigates its applications to many other areas [3, 24].
A key concept in gradual type systems is consistency [21]. Consistency weakens type equality to allow for the presence of unknown types. In some gradual type systems with subtyping, consistency is combined with subtyping to give rise to the notion of consistent subtyping [22]. Consistent subtyping is employed by gradual type systems to validate type conversions arising from conventional subtyping. One nice feature of consistent subtyping is that it is derivable from the more primitive notions of consistency and subtyping. As Siek and Taha [22] put it, this shows that “gradual typing and subtyping are orthogonal and can be combined in a principled fashion”. Thus consistent subtyping is often used as a guideline for designing gradual type systems with subtyping.
Unfortunately, as noted by Garcia et al. [13], notions of consistency and/or consistent subtyping “become more difficult to adapt as type systems get more complex”. In particular, for the case of type systems with subtyping, certain kinds of subtyping do not fit well with the original definition of consistent subtyping by Siek and Taha [22]. One important case where such a mismatch happens is in type systems supporting implicit (higher-rank) polymorphism [11, 18]. It is well-known that polymorphic types à la System F induce a subtyping relation that relates polymorphic types to their instantiations [16, 17]. However, Siek and Taha’s [22] definition is not adequate for this kind of subtyping. Moreover, the current framework for Abstracting Gradual Typing (AGT) [13] also does not account for polymorphism, with the authors acknowledging that this is one of the interesting avenues for future work.
Existing work on gradual type systems with polymorphism does not use consistent subtyping. The Polymorphic Blame Calculus (\(\lambda \mathsf {B}\)) [1] is an explicitly polymorphic calculus with explicit casts, which is often used as a target language for gradual type systems with polymorphism. In \(\lambda \mathsf {B}\) a notion of compatibility is employed to validate conversions allowed by casts. Interestingly \(\lambda \mathsf {B}\) allows conversions from polymorphic types to their instantiations. For example, it is possible to cast a value with type \(\forall a{.} a \rightarrow a\) into \(\mathsf {Int}\rightarrow \mathsf {Int}\). Thus an important remark here is that while \(\lambda \mathsf {B}\) is explicitly polymorphic, casting and conversions are closer to implicit polymorphism. That is, in a conventional explicitly polymorphic calculus (such as System F), the primary notion is type equality, where instantiation is not taken into account. Thus the types \(\forall a{.} a \rightarrow a\) and \(\mathsf {Int}\rightarrow \mathsf {Int}\) are deemed incompatible. However in implicitly polymorphic calculi [11, 18] \(\forall a{.} a \rightarrow a\) and \(\mathsf {Int}\rightarrow \mathsf {Int}\) are deemed compatible, since the latter type is an instantiation of the former. Therefore \(\lambda \mathsf {B}\) is in a sense a hybrid between implicit and explicit polymorphism, utilizing type equality (à la System F) for validating applications, and compatibility for validating casts.
An alternative approach to polymorphism has recently been proposed by Igarashi et al. [14]. Like \(\lambda \mathsf {B}\) their calculus is explicitly polymorphic. However, in that work they employ type consistency to validate cast conversions, and forbid conversions from \(\forall a{.} a \rightarrow a\) to \(\mathsf {Int}\rightarrow \mathsf {Int}\). This makes their casts closer to explicit polymorphism, in contrast to \(\lambda \mathsf {B}\). Nonetheless, there is still some flavour of implicit polymorphism in their calculus when it comes to interactions between dynamically typed and polymorphically typed code. For example, in their calculus type consistency allows types such as \(\forall a{.} a \rightarrow \mathsf {Int}\) to be related to \(\star \rightarrow \mathsf {Int}\), where some sort of (implicit) polymorphic subtyping is involved.
Gradual typing and polymorphism are orthogonal and can be combined in a principled fashion. ^{1}
With the insights gained from our work, we argue that, for implicit polymorphism, Ahmed et al.’s [1] notion of compatibility is too permissive (i.e. too many programs are allowed to typecheck), and that Igarashi et al.’s [14] notion of type consistency is too conservative. As a step towards an algorithmic version of consistent subtyping, we present a syntax-directed version of consistent subtyping that is sound and complete with respect to our formal definition of consistent subtyping. The syntax-directed version of consistent subtyping is remarkably simple and well-behaved, without the ad-hoc restriction operator [22]. Moreover, to further illustrate the generality of our consistent subtyping definition, we show that it can also account for top types, which cannot be dealt with by Siek and Taha’s [22] definition either.
 We define a framework for consistent subtyping with:

a new definition of consistent subtyping that subsumes and generalizes that of Siek and Taha [22], and can deal with polymorphism and top types.

a syntax-directed version of consistent subtyping that is sound and complete with respect to our definition of consistent subtyping, but still guesses polymorphic instantiations.


Based on consistent subtyping, we present a declarative gradual type system with predicative implicit higher-rank polymorphism. We prove that our calculus satisfies the static aspects of the refined criteria for gradual typing [25], and is type-safe by a type-directed translation to \(\lambda \mathsf {B}\), and thus hereditarily preserves parametricity [2].

We present a complete and sound bidirectional algorithm for implementing the declarative system based on the design principle of Garcia and Cimini [12] and the approach of Dunfield and Krishnaswami [11].

All of the metatheory of this paper, except some manual proofs for the algorithmic type system, has been mechanically formalized in Coq^{2}.
2 Background and Motivation
In this section we review a simple gradually typed language with objects [22], to introduce the concept of consistent subtyping. We also briefly discuss the Odersky-Läufer type system for higher-rank types [17], which serves as the original language on which our gradually typed calculus with implicit higher-rank polymorphism is based.
2.1 Gradual Subtyping
Siek and Taha [22] developed a gradual type system for object-oriented languages that they call \(\mathbf {FOb}^{?}_{<:}\). Central to gradual typing is the concept of consistency (written \(\sim \)) between gradual types, which are types that may involve the unknown type \(\star \). The intuition is that consistency relaxes the structure of a type system to tolerate unknown positions in a gradual type. They also defined the subtyping relation in a way that static type safety is preserved. Their key insight is that the unknown type \(\star \) is neutral to subtyping, with only \(\star <:\star \). Both relations are found in Fig. 1.
A primary contribution of their work is to show that consistency and subtyping are orthogonal. To compose subtyping and consistency, Siek and Taha [22] defined consistent subtyping (written \(\lesssim \)) in two equivalent ways:
Definition 1

\(A \lesssim B\) if and only if \(A \sim C\) and \(C <:B\) for some C.

\(A \lesssim B\) if and only if \(A <:C\) and \(C \sim B\) for some C.
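To make Definition 1 concrete, the following is a small executable sketch (our own, not from the paper) over a representative fragment: base types \(\mathsf{Int}\) and \(\mathsf{Float}\) with \(\mathsf{Int} <: \mathsf{Float}\), function types, and the unknown type \(\star \). The direct structural relation `csub` below agrees with both clauses of Definition 1 on this fragment; all names are ours.

```haskell
-- Gradual types over a small fragment (names are ours, for illustration).
data Type = TInt | TFloat | TFun Type Type | TUnknown
  deriving (Eq, Show)

-- Consistency (~): structural, with the unknown type related to every type.
consistent :: Type -> Type -> Bool
consistent TUnknown _                = True
consistent _ TUnknown                = True
consistent (TFun a1 a2) (TFun b1 b2) = consistent a1 b1 && consistent a2 b2
consistent a b                       = a == b

-- Subtyping (<:): Int <: Float; functions are contravariant in the domain;
-- the unknown type is neutral (only * <: *, via the equality case).
subtype :: Type -> Type -> Bool
subtype TInt TFloat               = True
subtype (TFun a1 a2) (TFun b1 b2) = subtype b1 a1 && subtype a2 b2
subtype a b                       = a == b

-- Consistent subtyping, written as one structural relation; on this
-- fragment it coincides with "A ~ C and C <: B for some C" (Definition 1).
csub :: Type -> Type -> Bool
csub TUnknown _                = True
csub _ TUnknown                = True
csub (TFun a1 a2) (TFun b1 b2) = csub b1 a1 && csub a2 b2
csub a b                       = subtype a b
```

For instance, `csub (TFun TFloat TInt) (TFun TUnknown TFloat)` holds: take \(C = \star \rightarrow \mathsf{Int}\), so that \(\mathsf{Float} \rightarrow \mathsf{Int} \sim C\) and \(C <: \star \rightarrow \mathsf{Float}\).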
2.2 The Odersky-Läufer Type System
The calculus we are combining gradual typing with is the well-established predicative type system for higher-rank types proposed by Odersky and Läufer [17]. One difference is that, for simplicity, we do not account for a let expression, as there is existing work on gradual type systems with let expressions and let generalization (for example, see Garcia and Cimini [12]). Similar techniques can be applied to our calculus to enable let generalization.
The syntax of the type system, along with the typing and subtyping judgments is given in Fig. 2. An implicit assumption throughout the paper is that variables in contexts are distinct. We save the explanations for the static semantics to Sect. 4, where we present our gradually typed version of the calculus.
2.3 Motivation: Gradually Typed Higher-Rank Polymorphism
Our work combines implicit (higher-rank) polymorphism with gradual typing. As is well known, a gradually typed language supports both fully static and fully dynamic checking of program properties, as well as the continuum between these two extremes. It also offers programmers fine-grained control over the static-to-dynamic spectrum, i.e., a program can be evolved by introducing more or less precise types as needed [13].
Consider a program that applies its functional argument to lists of two different element types, such as \((\lambda x.~(x~[1, 2],~x~[\text {'a'}, \text {'b'}]))~\mathit{reverse}\). This program is rejected by Haskell’s type checker because Haskell implements the Damas-Milner rule that a lambda-bound argument (such as x here) can only have a monotype, i.e., the type checker can only assign x the type \([\mathsf {Int}] \rightarrow [\mathsf {Int}]\), or \([\mathsf {Char}] \rightarrow [\mathsf {Char}]\), but not \(\forall a{.} [a] \rightarrow [a]\). Finding such manual polymorphic annotations can be non-trivial. Instead of rejecting the program outright, due to missing type annotations, gradual typing provides a simple alternative by giving x the unknown type (denoted \(\star \)). With such typing the same program typechecks and produces \(([2, 1], ['b', 'a'])\). By running the program, programmers can gain some additional insight about the runtime behaviour. Then, with such insight, they can also give x a more precise type (\(\forall a{.} [a] \rightarrow [a]\)) a posteriori so that the program continues to typecheck via implicit polymorphism and also grants more static safety. In this paper, we envision such a language that combines the benefits of both implicit higher-rank polymorphism and gradual typing.
3 Revisiting Consistent Subtyping
The syntax of types is given at the top of Fig. 3. We write A, B for types. Types are either the integer type \(\mathsf {Int}\), type variables a, function types \(A \rightarrow B\), universal quantification \(\forall a{.} A\), or the unknown type \(\star \). Though we only have one base type \(\mathsf {Int}\), we also use \(\mathsf {Bool}\) for the purpose of illustration. Note that monotypes \(\tau \) contain all types other than universally quantified types and the unknown type \(\star \). We will discuss this restriction when we present the subtyping rules. Contexts \(\varPsi \) are ordered lists of type variable declarations and term variables.
3.1 Consistency and Subtyping
We start by giving the definitions of consistency and subtyping for polymorphic types, and comparing our definitions with the compatibility relation by Ahmed et al. [1] and type consistency by Igarashi et al. [14].
Consistency. The key observation here is that consistency is mostly a structural relation, except that the unknown type \(\star \) can be regarded as any type. Following this observation, we naturally extend the definition from Fig. 1 with polymorphic types, as shown at the middle of Fig. 3. In particular a polymorphic type \(\forall a{.} A\) is consistent with another polymorphic type \(\forall a{.} B\) if A is consistent with B.
Subtyping. We express the fact that one type is a polymorphic generalization of another by means of the subtyping judgment \(\varPsi \vdash A <:B\). Compared with the subtyping rules of Odersky and Läufer [17] in Fig. 2, the only addition is the neutral subtyping of \(\star \). Notice that, in the rule SForallL, the universal quantifier is only allowed to be instantiated with a monotype. The judgment \(\varPsi \vdash \tau \) checks that all the type variables in \(\tau \) are bound in the context \(\varPsi \). For space reasons, we omit the definition. According to the syntax in Fig. 3, monotypes do not include the unknown type \(\star \). This is because if we were to allow the unknown type to be used for instantiation, we could have \( \forall a{.} a \rightarrow a <:\star \rightarrow \star \) by instantiating a with \(\star \). Since \(\star \rightarrow \star \) is consistent with any function type \(A \rightarrow B\), for instance \(\mathsf {Int}\rightarrow \mathsf {Bool}\), this means that we could provide an expression of type \(\forall a{.} a \rightarrow a\) to a function whose input type is supposed to be \(\mathsf {Int}\rightarrow \mathsf {Bool}\). However, as we might expect, \(\forall a{.} a \rightarrow a\) is definitely not compatible with \(\mathsf {Int}\rightarrow \mathsf {Bool}\): such a conversion is not allowed in any polymorphic type system without gradual typing, so the gradual type system should not accept it either. (This is the so-called conservative extension property that will be made precise in Sect. 4.3.)
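To illustrate rule SForallL concretely, the following sketch (our own encoding; binder handling is simplified by assuming all bound variables are distinct, and rules such as SForallR are omitted) checks \(\forall a{.} A <: B\) for a caller-supplied monotype guess \(\tau \), and shows why \(\star \) must be excluded from monotypes.

```haskell
-- Polymorphic gradual types (encoding is ours, for illustration).
data PType = PInt | PBool | PVar String | PFun PType PType
           | PAll String PType | PUnknown
  deriving (Eq, Show)

-- Monotypes exclude the universal quantifier and the unknown type (Fig. 3).
isMono :: PType -> Bool
isMono (PAll _ _) = False
isMono PUnknown   = False
isMono (PFun a b) = isMono a && isMono b
isMono _          = True

-- Naive substitution [tau/a]A; all binders are assumed distinct, so no
-- capture-avoidance is needed in this sketch.
subst :: String -> PType -> PType -> PType
subst a tau (PVar b) | a == b   = tau
subst a tau (PFun x y)          = PFun (subst a tau x) (subst a tau y)
subst a tau (PAll b x) | a /= b = PAll b (subst a tau x)
subst _ _   t                   = t

-- A minimal subtype checker: reflexivity, function types, and SForallL
-- with the monotype guess supplied explicitly (the declarative system
-- merely guesses it).
subtypeWith :: PType -> PType -> PType -> Bool
subtypeWith tau (PAll a body) b =
  isMono tau && subtypeWith tau (subst a tau body) b
subtypeWith tau (PFun a1 a2) (PFun b1 b2) =
  subtypeWith tau b1 a1 && subtypeWith tau a2 b2
subtypeWith _ a b = a == b
```

With this, guessing \(\mathsf{Int}\) shows \(\forall a{.} a \rightarrow a <: \mathsf{Int} \rightarrow \mathsf{Int}\), whereas guessing \(\star \) is rejected outright because `isMono PUnknown` is `False`, exactly the restriction discussed above.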
Compared with our consistency definition in Fig. 3, the first rule of Igarashi et al.’s [14] type consistency is the same as ours. The second rule says that a non-\(\forall \) type can be consistent with a \(\forall \)-type only if it contains \(\star \). In this way, their type system is able to reject \(\forall a{.} a \rightarrow a \sim \mathsf {Int}\rightarrow \mathsf {Bool}\). However, in order to preserve conservativity, they also reject \(\forall a{.} a \rightarrow a \sim \mathsf {Int}\rightarrow \mathsf {Int}\), which is perfectly sensible in their setting (i.e., explicit polymorphism). With implicit polymorphism, however, we would expect \(\forall a{.} a \rightarrow a\) to be related to \(\mathsf {Int}\rightarrow \mathsf {Int}\), since a can be instantiated to \(\mathsf {Int}\).
Nonetheless, when it comes to interactions between dynamically typed and polymorphically typed terms, both relations allow \(\forall a{.} a \rightarrow \mathsf {Int}\) to be related to \(\star \rightarrow \mathsf {Int}\), for example, which in our view is a form of (implicit) polymorphic subtyping combined with type consistency, and should be derivable from the more primitive notions in the type system (instead of inventing new relations). One of our design principles is that subtyping and consistency are orthogonal, and can be naturally superimposed, echoing the opinion of Siek and Taha [22].
3.2 Towards Consistent Subtyping
With the definitions of consistency and subtyping, the question now is how to compose these two relations so that two types can be compared in a way that takes these two relations into account.
Unfortunately, the original definition of Siek and Taha [22] (Definition 1) does not work well with our definitions of consistency and subtyping for polymorphic types. Consider two types: \((\forall a{.} a \rightarrow \mathsf {Int}) \rightarrow \mathsf {Int}\), and \((\star \rightarrow \mathsf {Int}) \rightarrow \mathsf {Int}\). The first type can only reach the second type in one way (first by applying consistency, then subtyping), but not the other way, as shown in Fig. 4a. We use \(\bot \) to mean that we cannot find such a type. Similarly, there are situations where the first type can only reach the second type by the other way (first applying subtyping, and then consistency), as shown in Fig. 4b.
What is worse, if those two examples are composed in a way that those types all appear covariantly, then the resulting types cannot reach each other in either way. For example, Fig. 4c shows such two types by putting a \(\mathsf {Bool}\) type in the middle, and neither definition of consistent subtyping works.
Observations on Consistent Subtyping Based on Information Propagation. In order to develop the correct definition of consistent subtyping for polymorphic types, we need to understand how consistent subtyping works. We first review two important properties of subtyping: (1) subtyping induces the subsumption rule: if \(A <:B\), then an expression of type A can be used where B is expected; (2) subtyping is transitive: if \(A <:B\), and \(B <:C\), then \(A <:C\). Though consistent subtyping takes the unknown type into consideration, the subsumption rule should still apply: if \(A \lesssim B\), then an expression of type A can also be used where B is expected, given that some information may be lost by consistency. A crucial difference from subtyping is that consistent subtyping is not transitive, because information can only be lost once (otherwise, any two types would be consistent subtypes of each other). Now consider a situation where we have both \(A <:B\), and \(B \lesssim C\). This means that A can be used where B is expected, and B can be used where C is expected, with possibly some loss of information. In other words, we should expect that A can be used where C is expected, since information is lost at most once.
Observation 1
If \(A <:B\), and \(B \lesssim C\), then \(A \lesssim C\).
This is reflected in Fig. 5a. A symmetrical observation is given in Fig. 5b:
Observation 2
If \(C \lesssim B\), and \(B <:A\), then \(C \lesssim A\).
From the above observations, we see what the problem is with the original definition. In Fig. 5a, if B can reach C by \(T_1\), then by subtyping transitivity, A can reach C by \(T_1\). However, if B can only reach C by \(T_2\), then A cannot reach C through the original definition. A similar problem is shown in Fig. 5b.
However, it turns out that those two problems can be fixed using the same strategy: instead of taking one-step subtyping and one-step consistency, our definition of consistent subtyping allows types to take one-step subtyping, one-step consistency, and one more step of subtyping. Specifically, \(A<:B \sim T_2 <:C\) (in Fig. 5a) and \(C<:T_1 \sim B <:A\) (in Fig. 5b) have the same relation chain: subtyping, consistency, and subtyping.
Definition 2 (Consistent Subtyping). \(\varPsi {} \vdash A \lesssim B\) if and only if \(\varPsi {} \vdash A <:C\), \(C \sim D\), and \(\varPsi {} \vdash D <:B\) for some C, D.
With Definition 2, Fig. 6 illustrates the correct relation chain for the broken example shown in Fig. 4c. At first sight, Definition 2 seems worse than the original: we need to guess two types! It turns out that Definition 2 is a generalization of Definition 1, and they are equivalent in the system of Siek and Taha [22]. However, more generally, Definition 2 is compatible with polymorphic types.
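The claimed equivalence on simple types can be checked by brute force. The sketch below (ours, not from the paper) quantifies the intermediate types of both definitions over a finite universe of gradual types of depth at most one, which suffices because neither consistency nor subtyping changes the structure depth of a type on this fragment; \(\mathsf{Int} <: \mathsf{Float}\) is assumed as the base subtyping.

```haskell
-- Simple gradual types with Int <: Float (encoding is ours).
data Type = TInt | TFloat | TFun Type Type | TUnknown
  deriving (Eq, Show)

consistent :: Type -> Type -> Bool
consistent TUnknown _                = True
consistent _ TUnknown                = True
consistent (TFun a1 a2) (TFun b1 b2) = consistent a1 b1 && consistent a2 b2
consistent a b                       = a == b

subtype :: Type -> Type -> Bool
subtype TInt TFloat               = True
subtype (TFun a1 a2) (TFun b1 b2) = subtype b1 a1 && subtype a2 b2
subtype a b                       = a == b

-- All gradual types of depth <= 1: enough to contain every witness
-- for pairs drawn from the same universe.
universe :: [Type]
universe = base ++ [TFun a b | a <- base, b <- base]
  where base = [TInt, TFloat, TUnknown]

-- Definition 1: A ~ C and C <: B for some C.
def1 :: Type -> Type -> Bool
def1 a b = any (\c -> consistent a c && subtype c b) universe

-- Definition 2: A <: C, C ~ D, and D <: B for some C, D.
def2 :: Type -> Type -> Bool
def2 a b = or [ subtype a c && consistent c d && subtype d b
              | c <- universe, d <- universe ]
```

Enumerating all pairs from `universe` confirms that `def1` and `def2` agree on this fragment; with polymorphic types, as Fig. 4 shows, the two definitions come apart and only Definition 2 behaves as desired.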
3.3 Abstracting Gradual Typing
Garcia et al. [13] presented a new foundation for gradual typing that they call the Abstracting Gradual Typing (AGT) approach. In the AGT approach, gradual types are interpreted as sets of static types, where static types refer to types containing no unknown types. In this interpretation, predicates and functions on static types can then be lifted to apply to gradual types. Central to their approach is the so-called concretization function. For simple types, a concretization \(\gamma \) from gradual types to a set of static types^{4} is defined as follows:
Definition 3 (Concretization). \(\gamma (\mathsf {Int}) = \{\mathsf {Int}\}\); \(\gamma (A \rightarrow B) = \{A' \rightarrow B' \mid A' \in \gamma (A), B' \in \gamma (B)\}\); \(\gamma (\star )\) is the set of all static types.
Based on the concretization function, subtyping between static types can be lifted to gradual types, resulting in the consistent subtyping relation:
Definition 4
(Consistent Subtyping in AGT). \(A ~\widetilde{<:}~B\) if and only if \(A_1 <:B_1\) for some \(A_1 \in \gamma (A)\), \(B_1 \in \gamma (B)\).
Later they proved that this definition of consistent subtyping coincides with that of Siek and Taha [22] (Definition 1). By Proposition 1, we can directly conclude that our definition coincides with AGT:
Proposition 2
(Equivalence to AGT on Simple Types). \(A \lesssim B\) iff \(A ~\widetilde{<:}~B\).
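Proposition 2 can be tested executably on simple types. The sketch below (ours; the depth bound, the \(\mathsf{Int} <: \mathsf{Float}\) base subtyping, and all names are our assumptions) approximates the infinite \(\gamma (\star )\) by enumerating static types up to a fixed depth, implements Definition 4 by searching the concretizations, and compares the result against the direct structural definition of consistent subtyping.

```haskell
import Data.List (nub)

-- Gradual and static simple types (encodings are ours).
data GType = GInt | GFloat | GFun GType GType | GUnknown
  deriving (Eq, Show)
data SType = SInt | SFloat | SFun SType SType
  deriving (Eq, Show)

-- Static types up to a given depth, approximating the infinite gamma(*).
statics :: Int -> [SType]
statics 0 = [SInt, SFloat]
statics n = nub (prev ++ [SFun a b | a <- prev, b <- prev])
  where prev = statics (n - 1)

-- Concretization (Definition 3), with gamma(*) cut off at the given depth.
gamma :: Int -> GType -> [SType]
gamma _ GInt       = [SInt]
gamma _ GFloat     = [SFloat]
gamma d (GFun a b) = [SFun a' b' | a' <- gamma d a, b' <- gamma d b]
gamma d GUnknown   = statics d

-- Static subtyping with Int <: Float.
ssub :: SType -> SType -> Bool
ssub SInt SFloat               = True
ssub (SFun a1 a2) (SFun b1 b2) = ssub b1 a1 && ssub a2 b2
ssub a b                       = a == b

-- Definition 4: subtyping lifted through concretization.
agtSub :: Int -> GType -> GType -> Bool
agtSub d a b = or [ ssub a1 b1 | a1 <- gamma d a, b1 <- gamma d b ]

-- Direct structural consistent subtyping, for comparison.
csub :: GType -> GType -> Bool
csub GUnknown _                = True
csub _ GUnknown                = True
csub (GFun a1 a2) (GFun b1 b2) = csub b1 a1 && csub a2 b2
csub GInt GFloat               = True
csub a b                       = a == b
```

On a universe of gradual types of depth at most one, a depth-1 approximation of \(\gamma (\star )\) already contains all needed witnesses, and `agtSub 1` agrees with `csub` on every pair.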
However, AGT does not yet show how to deal with polymorphism (e.g. the interpretation of type variables). Still, as noted by Garcia et al. [13], this is a promising line of future work for AGT, and the question remains whether our definition would coincide with it.
Another note related to AGT is that this definition was later adopted by Castagna and Lanvin [7], where the static types \(A_1, B_1\) in Definition 4 can be computed algorithmically, also accounting for top and bottom types.
3.4 Directed Consistency
The value of Castagna and Lanvin’s [7] definition is that consistent subtyping is derived compositionally from static subtyping and precision, both of which are more atomic relations. At first sight, their definition looks very similar to Definition 2 (replacing \(\sqsubseteq \) by \(<:\) and \(<:\) by \(\sim \)). A question then arises as to which one is more fundamental. To answer this, we need to discuss the relation between consistency and precision.
Relating Consistency and Precision. Precision is a partial order (antisymmetric and transitive), while consistency is symmetric but not transitive. Nonetheless, precision and consistency are related by the following proposition:
Proposition 3

If \(A \sim B\), then there exists (static) C, such that \(A \sqsubseteq C\), and \(B \sqsubseteq C\).

If for some (static) C, we have \(A \sqsubseteq C\), and \(B \sqsubseteq C\), then we have \(A \sim B\).
It may seem that precision is a more atomic relation, since consistency can be derived from precision. However, recall that consistency is in fact an equivalence relation lifted from static types to gradual types. Therefore defining consistency independently is straightforward, and it is theoretically viable to validate the definition of consistency directly. On the other hand, precision is usually connected with the gradual criteria [25], and finding a correct partial order that adheres to the criteria is not always an easy task. For example, Igarashi et al. [14] argued that term precision for System \(F_G\) is actually nontrivial, leaving the gradual guarantee of the semantics as a conjecture. Thus precision can be difficult to extend to more sophisticated type systems, e.g. dependent types.
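Proposition 3 can likewise be validated by brute force on simple types. In this sketch (ours), \(\sqsubseteq \) is the precision order in which the unknown type is the least precise type, static types are exactly those containing no \(\star \), and \(\mathsf{Int}\) and \(\mathsf{Float}\) are the assumed base types.

```haskell
data Type = TInt | TFloat | TFun Type Type | TUnknown
  deriving (Eq, Show)

-- Precision: A ⊑ B when B is at least as precise as A;
-- the unknown type is the least precise type.
prec :: Type -> Type -> Bool
prec TUnknown _                = True
prec (TFun a1 a2) (TFun b1 b2) = prec a1 b1 && prec a2 b2
prec a b                       = a == b

-- Static types contain no unknown type.
isStatic :: Type -> Bool
isStatic TUnknown   = False
isStatic (TFun a b) = isStatic a && isStatic b
isStatic _          = True

consistent :: Type -> Type -> Bool
consistent TUnknown _                = True
consistent _ TUnknown                = True
consistent (TFun a1 a2) (TFun b1 b2) = consistent a1 b1 && consistent a2 b2
consistent a b                       = a == b

-- All gradual types of depth <= 1; deep enough to hold the static
-- upper bounds for pairs drawn from the same universe.
universe :: [Type]
universe = base ++ [TFun a b | a <- base, b <- base]
  where base = [TInt, TFloat, TUnknown]

-- Proposition 3: A ~ B iff A and B share a static upper bound under ⊑.
sharesStaticBound :: Type -> Type -> Bool
sharesStaticBound a b =
  any (\c -> isStatic c && prec a c && prec b c) universe
```

Enumerating all pairs confirms that consistency and "sharing a static upper bound under precision" coincide on this fragment, as both directions of Proposition 3 state.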
Still, it is interesting that these two definitions illustrate a correspondence between different foundations (on simple types): one is defined directly on gradual types, while the other stems from AGT, which is based on static subtyping.
3.5 Consistent Subtyping Without Existentials
Definition 2 serves as a fine specification of how consistent subtyping should behave in general. But it is inherently nondeterministic because of the two intermediate types C and D. As with Definition 1, we need a combined relation to directly compare two types. A natural attempt is to try to extend the restriction operator for polymorphic types. Unfortunately, as we show below, this does not work. However it is possible to devise an equivalent inductive definition instead.
Attempt to Extend the Restriction Operator. Suppose that we try to extend the restriction operator to account for polymorphic types. The original restriction operator is structural, meaning that it works for types of similar structures. But for polymorphic types, two input types could have different structures due to universal quantifiers, e.g., \(\forall a{.} a \rightarrow \mathsf {Int}\) and \((\mathsf {Int}\rightarrow \star ) \rightarrow \mathsf {Int}\). If we try to mask the first type using the second, it seems hard to maintain the information that a should be instantiated to a function while ensuring that the return type is masked. There seems to be no satisfactory way to extend the restriction operator in order to support this kind of nonstructural masking.
Theorem 1. \(\varPsi {} \vdash A \lesssim B \Leftrightarrow \varPsi {} \vdash A <:C\), \(C \sim D\), \(\varPsi {} \vdash D <:B\) for some C, D.
4 Gradually Typed Implicit Polymorphism
4.1 Typing in Detail
Figure 8 gives the typing rules for our declarative system (the reader is advised to ignore the gray-shaded parts for now). Rule Var extracts the type of the variable from the typing context. Rule Nat always infers integer types. Rule LamAnn puts x with type annotation A into the context, and continues type checking the body e. Rule Lam assigns a monotype \(\tau \) to x, and continues type checking the body e. Gradual types and polymorphic types are introduced via annotations explicitly. Rule Gen puts a fresh type variable a into the type context and generalizes the typing result A to \(\forall a{.} A\). Rule App first infers the type of \(e_1\), then the matching judgment \(\varPsi {} \vdash A \triangleright A_1 \rightarrow A_2\) extracts the domain type \(A_1\) and the codomain type \(A_2\) from type A. The type \(A_3\) of the argument \(e_2\) is then compared with \(A_1\) using the consistent subtyping judgment.
Matching. The matching judgment of Siek et al. [25] can be extended to polymorphic types naturally, resulting in \(\varPsi {} \vdash A \triangleright A_1 \rightarrow A_2\). In MForall, a monotype \(\tau \) is guessed to instantiate the universal quantifier a. This rule is inspired by the application judgment \(\varPhi \vdash A \bullet e \Rightarrow C\) [11], which says that if we apply a term of type A to an argument e, we get something of type C. If A is a polymorphic type, the judgment works by guessing instantiations until it reaches an arrow type. Matching further simplifies the application judgment, since it is independent of typing. Rules MArr and MUnknown are the same as in Siek et al. [25]: MArr returns the domain type \(A_1\) and range type \(A_2\) as expected, and if the input is \(\star \), MUnknown returns \(\star \) as both the domain and the range.
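The matching judgment can be transcribed almost directly. In this sketch (ours; binders are assumed distinct, and the monotype guesses for MForall are supplied explicitly as a list, since the declarative system merely guesses them):

```haskell
-- Polymorphic gradual types (encoding is ours, for illustration).
data PType = PInt | PVar String | PFun PType PType
           | PAll String PType | PUnknown
  deriving (Eq, Show)

-- Monotypes exclude the universal quantifier and the unknown type.
isMono :: PType -> Bool
isMono (PAll _ _) = False
isMono PUnknown   = False
isMono (PFun a b) = isMono a && isMono b
isMono _          = True

-- Naive substitution; assumes all binders are distinct.
subst :: String -> PType -> PType -> PType
subst a tau (PVar b) | a == b   = tau
subst a tau (PFun x y)          = PFun (subst a tau x) (subst a tau y)
subst a tau (PAll b x) | a /= b = PAll b (subst a tau x)
subst _ _   t                   = t

-- Matching A ▷ A1 → A2. Each universal quantifier consumes one
-- monotype guess from the list.
match :: [PType] -> PType -> Maybe (PType, PType)
match _ (PFun a1 a2) = Just (a1, a2)                    -- MArr
match _ PUnknown     = Just (PUnknown, PUnknown)        -- MUnknown
match (tau : taus) (PAll a body)
  | isMono tau       = match taus (subst a tau body)    -- MForall
match _ _            = Nothing
```

For example, `match [PInt]` applied to \(\forall a{.} a \rightarrow a\) yields \((\mathsf{Int}, \mathsf{Int})\), while a \(\star \) guess is ruled out by the monotype restriction, the very issue revisited in Sect. 6.2.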
Note that matching saves us from having a subsumption rule (Sub in Fig. 2): the subsumption rule is incompatible with consistent subtyping, since the latter is not transitive. A discussion of a subsumption rule based on normal subtyping can be found in the appendix.
4.2 TypeDirected Translation
The translation is given in the gray-shaded parts in Fig. 8. The only interesting case here is to insert explicit casts in the application rule. Note that there is no need to translate matching or consistent subtyping; instead, we insert the source and target types of a cast directly in the translated expressions, thanks to the following two lemmas:
Lemma 1 (\(\triangleright \) to \(\prec \)). If \(\varPsi {} \vdash A \triangleright A_1 \rightarrow A_2\), then \(A \prec A_1 \rightarrow A_2\).
Lemma 2 (\(\lesssim \) to \(\prec \)). If \(\varPsi {} \vdash A \lesssim B\), then \(A \prec B\).
In order to show the correctness of the translation, we prove that our translation always produces well-typed expressions in \(\lambda \mathsf {B}\). By Lemmas 1 and 2, we have the following theorem:
Theorem 2 (Type Safety). If \(\varPsi {} \vdash e : A \rightsquigarrow {s}\), then \(\varPsi \vdash ^{ B }s : A\).
Parametricity. An important semantic property of polymorphic types is relational parametricity [19]. The parametricity property says that all instances of a polymorphic function should behave uniformly. A classic example is a function with the type \(\forall a{.} a \rightarrow a\). The parametricity property guarantees that a value of this type must be either the identity function (i.e., \(\lambda x . x\)) or the undefined function (one which never returns a value). However, with the addition of the unknown type \(\star \), careful measures must be taken to ensure parametricity. This is exactly the circumstance that \(\lambda \mathsf {B}\) was designed to address. Ahmed et al. [2] proved that \(\lambda \mathsf {B}\) satisfies relational parametricity. Based on their result, and by Theorem 2, parametricity is preserved in our system.
If we apply \(\lambda x:\star .~{f ~ x}\) to 3, which is fine since the function can take any input, the first translation runs smoothly in \(\lambda \mathsf {B}\), while the second one will raise a cast error (\(\mathsf {Int}\) cannot be cast to \(\mathsf {Bool}\)). Similarly, if we apply it to \(\mathsf {true}\), then the second succeeds while the first fails. The culprit lies in the highlighted parts where any instantiation of a would be put inside the explicit cast. More generally, any choice introduces an explicit cast to that type in the translation, which causes a runtime cast error if the function is applied to a value whose type does not match the guessed type. Note that this does not compromise the type safety of the translated expressions, since cast errors are part of the type safety guarantees.
Coherence. The ambiguity of translation seems to imply that the declarative system is incoherent. A semantics is coherent if distinct typing derivations of the same typing judgment possess the same meaning [20]. We argue that the declarative system is “coherent up to cast errors” in the sense that a welltyped program produces a unique value, or results in a cast error. In the above example, whatever the translation might be, applying \(\lambda x:\star .~{f ~ x}\) to 3 either results in a cast error, or produces 3, nothing else.
This discrepancy is due to the guessing nature of the declarative system. As far as the declarative system is concerned, both \(\mathsf {Int}\rightarrow \mathsf {Int}\) and \(\mathsf {Bool}\rightarrow \mathsf {Bool}\) are equally acceptable. But this is not the case at runtime. The astute reader may have noticed that the only appropriate choice is to instantiate f to \(\star \rightarrow \star \). However, as specified by rule MForall in Fig. 8, we can only instantiate type variables to monotypes, but \(\star \) is not a monotype! We will get back to this issue in Sect. 6.2 after we present the corresponding algorithmic system in Sect. 5.
4.3 Correctness Criteria
Siek et al. [25] present a set of properties that a well-designed gradual typing calculus must have, which they call the refined criteria. Among all the criteria, those related to the static aspects of gradual typing are well summarized by Cimini and Siek [8]. Here we review those criteria and adapt them to our notation. We have proved in Coq that our type system satisfies all these criteria.
 Conservative extension: for all static \(\varPsi \), e, and A,

if \(\varPsi \vdash ^{ OL }e : A \), then there exists B, such that \(\varPsi \vdash e : B\), and \(\varPsi {} \vdash B <:A\).

if \(\varPsi \vdash e : A\), then \(\varPsi \vdash ^{ OL }e : A \).


Monotonicity w.r.t. precision: for all \(\varPsi , e, e', A\), if \(\varPsi \vdash e : A\), and \(e' \sqsubseteq e\), then \(\varPsi \vdash e' : B\), and \(B \sqsubseteq A\) for some B.

Type Preservation of cast insertion: for all \(\varPsi , e, A\), if \(\varPsi \vdash e : A\), then \(\varPsi \vdash e : A \rightsquigarrow {s}\), and \(\varPsi \vdash ^{ B }s : A\) for some s.

Monotonicity of cast insertion: for all \(\varPsi , e_1, e_2, e_1', e_2', A\), if \(\varPsi \vdash e_1 : A \rightsquigarrow {e}_1'\), and \(\varPsi \vdash e_2 : A \rightsquigarrow {e}_2'\), and \(e_1 \sqsubseteq e_2\), then \(\varPsi \shortmid \varPsi \vdash e_1' \sqsubseteq ^{ B }e_2'\).
The first criterion states that the gradual type system should be a conservative extension of the original system. In other words, a static program is typeable in the Odersky-Läufer type system if and only if it is typeable in the gradual type system. A static program is one that does not contain any unknown type \(\star \)^{7}. Note, however, that since our gradual type system does not have the subsumption rule, it produces more general types.
The second criterion states that if a typeable expression loses some type information, it remains typeable. This criterion depends on the definition of the precision relation, written \(A \sqsubseteq B\), which is given in the appendix. The relation intuitively captures the notion of one type containing more or fewer unknown types (\(\star \)) than another. The precision relation over types lifts to programs: \(e_1 \sqsubseteq e_2\) means that \(e_1\) and \(e_2\) are the same program except that \(e_2\) has more unknown types.
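The intuition behind type precision can be sketched in code. The following is a minimal, hypothetical model, not the paper's official definition (which is in the appendix): gradual types are represented as `"int"`, `"bool"`, the unknown type `"star"`, or `("arrow", dom, cod)`, and `precision(a, b)` checks \(A \sqsubseteq B\) in the reading used above, i.e. B is A with some sub-parts replaced by \(\star \).

```python
STAR = "star"  # the unknown type


def precision(a, b):
    """A ⊑ B: B is A with zero or more sub-parts replaced by ★.

    Types are "int", "bool", "star", or ("arrow", dom, cod).
    """
    if b == STAR:
        # any type is related to ★, which carries the least information
        return True
    if isinstance(a, tuple) and isinstance(b, tuple):
        # arrows: compare domains and codomains component-wise
        return precision(a[1], b[1]) and precision(a[2], b[2])
    # base types (and ★ on the left) must match exactly
    return a == b
```

For instance, the annotation type of \(\lambda x:\mathsf {Int}.~x+1\) relates to that of \(\lambda x:\star .~x+1\): `precision(("arrow", "int", "int"), ("arrow", "star", "int"))` holds.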
The first two criteria are fundamental to gradual typing. They explain, for example, why both \((\lambda x:\mathsf {Int}.~{x + 1})\) and \((\lambda x:\star .~{x + 1})\) are typeable: the former is typeable in the Odersky-Läufer type system, and the latter is a less-precise version of it.
The last two criteria relate the compilation to the cast calculus. The third criterion is essentially the same as \(\mathcal {T}{\text {heorem}}\) 2, given that a target expression always exists, which is easily seen from Fig. 8. The last criterion ensures that the translation is monotonic over the precision relation \(\sqsubseteq \).
As for the dynamic guarantee, things become murkier, for two reasons: (1) as discussed above, our declarative system is incoherent, in that the runtime behaviour of the same source program can vary depending on the particular translation; (2) it is still unknown whether the dynamic guarantee holds in \(\lambda \mathsf {B}\). We discuss the dynamic guarantee further in Sect. 6.3.
5 Algorithmic Type System
In this section we give a bidirectional account of an algorithmic type system that implements the declarative specification. The algorithm is largely inspired by the algorithmic bidirectional system of Dunfield and Krishnaswami [11] (henceforth the DK system). However, our algorithmic system differs from theirs in three aspects: (1) the addition of the unknown type \(\star \); (2) the use of the matching judgment; and (3) the approach of gradual inference only producing static types [12]. We then prove that our algorithm is both sound and complete with respect to the declarative type system. Full proofs can be found in the appendix.
5.1 Algorithmic Consistent Subtyping and Instantiation
5.2 Algorithmic Typing
We now turn to the algorithmic typing rules in Fig. 12. The algorithmic system uses bidirectional type checking to accommodate polymorphism. Most of the rules are quite standard. Rule AApp, which differs significantly from its counterpart in the DK system, deserves attention: it relies on the algorithmic matching judgment \(\varGamma \vdash A \triangleright A_1 \rightarrow A_2 \dashv \varDelta \). Rule AMForallL replaces a with a fresh existential variable \(\widehat{a}\), thus eliminating guessing. Rules AMArr and AMUnknown correspond directly to the declarative rules. Rule AMVar, which has no declarative counterpart, is similar to InstRArr/InstLArr: we create \(\widehat{a}\) and \(\widehat{b}\) and add \(\widehat{c}= \widehat{a}\rightarrow \widehat{b}\) to the context.
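To make the matching judgment concrete, here is a small executable sketch over a hypothetical type representation: strings for base types, `"star"` for the unknown type, `("forall", a, body)`, `("arrow", dom, cod)`, and `("evar", n)` for existential variables. The rule names in the comments follow Fig. 12; everything else (representation, context-as-dict) is illustrative, not the paper's formalization.

```python
import itertools

_fresh = itertools.count()


def fresh_evar():
    """Generate a fresh existential variable."""
    return ("evar", next(_fresh))


def subst(ty, name, rep):
    """Substitute type variable `name` (a string) by `rep` in `ty`."""
    if ty == name:
        return rep
    if isinstance(ty, tuple) and ty[0] == "arrow":
        return ("arrow", subst(ty[1], name, rep), subst(ty[2], name, rep))
    if isinstance(ty, tuple) and ty[0] == "forall":
        if ty[1] == name:          # shadowed binder: stop
            return ty
        return ("forall", ty[1], subst(ty[2], name, rep))
    return ty


def match(ty, ctx):
    """Γ ⊢ A ▷ A1 → A2: extract a function type from A.

    `ctx` records solutions of existential variables (mutated in place).
    """
    if ty == "star":                                   # AMUnknown: ★ ▷ ★ → ★
        return ("star", "star")
    if isinstance(ty, tuple) and ty[0] == "forall":    # AMForallL: fresh evar
        return match(subst(ty[2], ty[1], fresh_evar()), ctx)
    if isinstance(ty, tuple) and ty[0] == "arrow":     # AMArr
        return (ty[1], ty[2])
    if isinstance(ty, tuple) and ty[0] == "evar":      # AMVar: ĉ := â → b̂
        a1, a2 = fresh_evar(), fresh_evar()
        ctx[ty] = ("arrow", a1, a2)
        return (a1, a2)
    raise TypeError("matching fails: not a function type")
```

For example, matching against \(\forall a{.}\,a \rightarrow a\) instantiates a with one fresh existential variable, so the returned domain and codomain coincide.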
5.3 Completeness and Soundness
We prove that the algorithmic rules are sound and complete with respect to the declarative specifications. We need an auxiliary judgment \(\varGamma \longrightarrow \varDelta \) that captures a notion of information increase from input contexts \(\varGamma \) to output contexts \(\varDelta \) [11].
Soundness. Roughly speaking, soundness of the algorithmic system says that given an expression e that type checks in the algorithmic system, there exists a corresponding expression \(e'\) that type checks in the declarative system. However there is one complication: e does not necessarily have more annotations than \(e'\). For example, by ALam we have \(\lambda {x}.~{x} \Leftarrow (\forall a{.} a) \rightarrow (\forall a{.} a)\), but \(\lambda {x}.~{x}\) itself cannot have type \((\forall a{.} a) \rightarrow (\forall a{.} a)\) in the declarative system. To circumvent that, we add an annotation to the lambda abstraction, resulting in \(\lambda x:(\forall a{.} a).~{x}\), which is typeable in the declarative system with the same type. To relate \(\lambda {x}.~{x}\) and \(\lambda x:(\forall a{.} a).~{x}\), we erase all annotations on both expressions. The definition of erasure \(\lfloor {\cdot }\rfloor \) is standard and thus omitted.
Theorem 1 (Soundness of Algorithmic Typing). Given \(\varDelta \longrightarrow \varOmega \):
 1.
If \(\varGamma \vdash e \Rightarrow A \dashv \varDelta \) then \(\exists e'\) such that \([\varOmega ]\varDelta \vdash e' : [\varOmega ]A\) and \(\lfloor {e}\rfloor = \lfloor {e'}\rfloor \).
 2.
If \(\varGamma \vdash e \Leftarrow A \dashv \varDelta \) then \(\exists e'\) such that \([\varOmega ]\varDelta \vdash e' : [\varOmega ]A\) and \(\lfloor {e}\rfloor = \lfloor {e'}\rfloor \).
Completeness. Completeness of the algorithmic system is the reverse of soundness: given a declarative judgment of the form \([\varOmega ]\varGamma \vdash [\varOmega ]\dots \), we want to obtain an algorithmic derivation of \(\varGamma \vdash \dots \dashv \varDelta \). Completeness turns out to be a bit trickier to state: the algorithmic rules generate existential variables on the fly, so \(\varDelta \) could contain unsolved existential variables that appear neither in \(\varGamma \) nor in \(\varOmega \). Therefore the completeness proof must produce another complete context \(\varOmega '\) that extends both the output context \(\varDelta \) and the given complete context \(\varOmega \). As with soundness, we use erasure to relate the two expressions.
Theorem 2
(Completeness of Algorithmic Typing). Given \(\varGamma \longrightarrow \varOmega \) and \(\varGamma \vdash A \), if \([\varOmega ]\varGamma \vdash e : A\) then there exist \(\varDelta \), \(\varOmega '\), \(A'\) and \(e'\) such that \(\varDelta \longrightarrow \varOmega '\) and \(\varOmega \longrightarrow \varOmega '\) and \(\varGamma \vdash e' \Rightarrow A' \dashv \varDelta \) and \(A = [\varOmega ']A'\) and \(\lfloor {e}\rfloor = \lfloor {e'}\rfloor \).
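The notation \([\varOmega ]A\), applying a complete context to a type, can be sketched as a substitution that replaces solved existential variables by their solutions, applied recursively because a solution may itself mention other existential variables. The representation below (a dict for the context, tuples for types) is an illustration, not the formal definition from Dunfield and Krishnaswami [11]:

```python
def apply_ctx(omega, ty):
    """[Ω]A: replace every solved existential variable in A by its solution.

    `omega` maps ("evar", n) keys to types; unsolved variables are left as-is.
    """
    if isinstance(ty, tuple) and ty[0] == "evar":
        sol = omega.get(ty)
        # solutions may mention further existentials, so recurse on them too
        return apply_ctx(omega, sol) if sol is not None else ty
    if isinstance(ty, tuple) and ty[0] == "arrow":
        return ("arrow", apply_ctx(omega, ty[1]), apply_ctx(omega, ty[2]))
    if isinstance(ty, tuple) and ty[0] == "forall":
        return ("forall", ty[1], apply_ctx(omega, ty[2]))
    return ty
```

With a complete context every existential variable is solved, so \([\varOmega ]A\) is a fully static type, which is what the completeness statement compares against the declarative type.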
6 Discussion
6.1 Top Types
To demonstrate that our definition of consistent subtyping (Definition 2) is applicable to other features, we show how to extend our approach to \(\mathsf {Top}\) types with all the desired properties preserved.
It is easy to verify that Definition 2 remains equivalent to the definition in Fig. 7 extended with rule CSTop. That is, \(\mathcal {T}{\text {heorem}}\) 1 continues to hold:
Proposition 4
(Extension with \(\top \)). \(\varPsi {} \vdash A \lesssim B \Leftrightarrow \varPsi {} \vdash A <:C\), \(C \sim D\), \(\varPsi {} \vdash D <:B\), for some C, D.
We extend the definition of concretization (Definition 3) with \(\top \) by adding another equation \(\gamma (\top ) = \{\top \}\). Note that Castagna and Lanvin [7] also have this equation in their calculus. It is easy to verify that Proposition 2 still holds:
Proposition 5
(Equivalent to AGT on \(\top \)). \(A \lesssim B\) if and only if \(A ~\widetilde{<:}~B\).
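The extended concretization can be sketched executably if we cut \(\gamma (\star )\) down to a finite universe of static types (in the calculus it is the infinite set of all static types). Apart from that restriction, the equations mirror Definition 3 plus \(\gamma (\top ) = \{\top \}\), with \(\rightarrow \) treated as a set-level operator as in footnote 4. The representation is hypothetical, for illustration only:

```python
def concretize(ty, universe):
    """γ(A): the set of static types a gradual type stands for.

    `universe` is a finite set of static types approximating γ(★).
    """
    if ty == "star":
        return set(universe)           # γ(★) = all static types (restricted)
    if ty == "top":
        return {"top"}                 # γ(⊤) = {⊤}, the new equation
    if isinstance(ty, tuple) and ty[0] == "arrow":
        # γ(A → B) = γ(A) → γ(B), read as a set-level operator
        return {("arrow", d, c)
                for d in concretize(ty[1], universe)
                for c in concretize(ty[2], universe)}
    return {ty}                        # γ(Int) = {Int}, etc.
```

Note that \(\gamma (\top )\) is the singleton \(\{\top \}\), not the whole universe: \(\top \) is imprecise as a supertype but perfectly precise as a gradual type.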
Siek and Taha’s [22] Definition of Consistent Subtyping Does Not Work for \(\top \). As in the analysis of Sect. 3.2, \(\mathsf {Int}\rightarrow \star \lesssim \top \) holds only when we first apply consistency, and then subtyping. However, we cannot find a type A such that \(\mathsf {Int}\rightarrow \star <:A\) and \(A \sim \top \). We face a similar problem in extending the restriction operator: non-structural masking between \(\mathsf {Int}\rightarrow \star \) and \(\top \) cannot easily be achieved.
6.2 Interpretation of the Dynamic Semantics
To emphasize the difference and better support a dynamic semantics, we could have gradual variables in addition to existential variables, with the difference that only unsolved gradual variables may be unified with the unknown type. An irreversible transition from existential variable to gradual variable occurs when an existential variable is compared with \(\star \). After the algorithm terminates, we can set all unsolved existential variables to any (static) type (or, more precisely, following Garcia and Cimini [12], to static type parameters), and all unsolved gradual variables to \(\star \) (or gradual type parameters). However, this approach requires a more sophisticated declarative/algorithmic type system than the ones presented in this paper, where we only produce static monotypes in type inference. We believe this is a typical trade-off in existing gradual type systems with inference [12, 23]. Here we suppress the complexity of the dynamic semantics in favour of the conciseness of static typing.
6.3 The Dynamic Guarantee
In Sect. 4.3 we mentioned that the dynamic guarantee is closely related to the coherence issue. To aid discussion, we first give the definition of dynamic guarantee as follows:
Definition 5
(Dynamic guarantee). Suppose \(e' \sqsubseteq e\), \( \emptyset \vdash e : A \rightsquigarrow {s}\) and \( \emptyset \vdash e' : A' \rightsquigarrow {s}'\), if \(s \Downarrow v\), then \(s' \Downarrow v'\) and \(v' \sqsubseteq v\).
The left one evaluates to 3, whereas its less precise version (right) will give a cast error if a is instantiated to \(\mathsf {Bool}\) for example.
As discussed in Sect. 6.2, we could design a more sophisticated declarative/algorithmic type system in which coherence is retained. However, even with a coherent source language, the dynamic guarantee remains in question: whether it holds for our target language \(\lambda \mathsf {B}\) is still open. According to Igarashi et al. [14], the difficulty lies in defining a notion of term precision that preserves the semantics.
7 Related Work
Along the way we have discussed some of the most relevant work to motivate, compare, and promote our gradual typing design. In what follows, we briefly discuss related work on gradual typing and polymorphism.
Gradual Typing. The seminal paper by Siek and Taha [21] is the first to propose gradual typing. The original proposal extends the simply typed lambda calculus by introducing the unknown type \(\star \) and replacing type equality with type consistency. Later, Siek and Taha [22] incorporated gradual typing into a simple object-oriented language, and showed that subtyping and consistency are orthogonal, an insight that partly inspired our work. We show that subtyping and consistency remain orthogonal in a much richer type system with higher-rank polymorphism. Siek et al. [25] proposed a set of criteria that provides important guidelines for designers of gradually typed languages. Cimini and Siek [8] introduced the Gradualizer, a general methodology for generating gradual type systems from static type systems; later they also developed an algorithm to generate the dynamic semantics [9]. Garcia et al. [13] introduced the AGT approach, based on abstract interpretation.
Gradual Type Systems with Explicit Polymorphism. Ahmed et al. [1] proposed \(\lambda \mathsf {B}\), which extends the blame calculus [29] to incorporate polymorphism. The key novelty of their work is the use of dynamic sealing to enforce parametricity. Devriese et al. [10] proved that the embedding of System F terms into \(\lambda \mathsf {B}\) is not fully abstract. Igarashi et al. [14] also studied the integration of gradual typing with parametric polymorphism. They proposed System F\(_G\), a gradually typed extension of System F, and System F\(_C\), a new polymorphic blame calculus. As has been discussed extensively, their definition of type consistency does not apply to our setting (implicit polymorphism). All of these approaches mix consistency with subtyping to some extent, which we argue should be kept orthogonal.
Gradual Type Inference. Siek and Vachharajani [23] studied unification-based type inference for gradual typing, showing why three straightforward approaches fail to meet their design goals. Their type system infers gradual types, which results in a complicated type system and inference algorithm. Garcia and Cimini [12] presented a new approach where gradual type inference only produces static types, which we adopt in our type system. They also deal with let-polymorphism (rank-1 types). However, none of these works deals with higher-rank implicit polymorphism.
Higher-Rank Implicit Polymorphism. Odersky and Läufer [17] introduced a type system for higher-rank types. Based on that, Peyton Jones et al. [18] developed an approach to type checking higher-rank predicative polymorphism. Dunfield and Krishnaswami [11] proposed a bidirectional account of higher-rank polymorphism, together with an algorithm implementing the declarative system, which is the main inspiration for our algorithmic system. The key difference, however, is the integration of gradual typing. Vytiniotis et al. [28] defer static type errors to runtime, which is fundamentally different from gradual typing, where programmers control the trade-off between static and runtime checks through the precision of the annotations.
8 Conclusion
In this paper, we have presented a generalized definition of consistent subtyping, which is proved to be applicable to both polymorphic types and top types. Based on the new definition, we have developed a gradually typed calculus with predicative implicit higher-rank polymorphism, and an algorithm to implement it. As future work, we are interested in investigating whether our results can scale to real-world languages and other programming language features.
Footnotes
 1.
 2.
All supplementary materials are available at https://bitbucket.org/xieningning/consistentsubtyping.
 3.
This is a simplified version.
 4.
For simplicity, we directly regard the type constructor \(\rightarrow \) as a set-level operator.
 5.
The definition of precision of types is given in appendix.
 6.
Theorems with \(\mathcal {T}\) are those proved in Coq. The same applies to \(\mathcal {L}\)emmas.
 7.
Note that the term static has appeared several times with different meanings.
Acknowledgements
We thank Ronald Garcia and the anonymous reviewers for their helpful comments. This work has been sponsored by the Hong Kong Research Grant Council projects number 17210617 and 17258816.
References
 1.Ahmed, A., Findler, R.B., Siek, J.G., Wadler, P.: Blame for all. In: Proceedings of the 38th Symposium on Principles of Programming Languages (2011)
 2.Ahmed, A., Jamner, D., Siek, J.G., Wadler, P.: Theorems for free for free: parametricity, with and without types. In: Proceedings of the 22nd International Conference on Functional Programming (2017)
 3.Schwerter, F.B., Garcia, R., Tanter, É.: A theory of gradual effect systems. In: Proceedings of the 19th International Conference on Functional Programming (2014)
 4.Bierman, G., Meijer, E., Torgersen, M.: Adding dynamic types to C\(^\sharp \). In: D’Hondt, T. (ed.) ECOOP 2010. LNCS, vol. 6183, pp. 76–100. Springer, Heidelberg (2010). https://doi.org/10.1007/978-3-642-14107-2_5
 5.Bierman, G., Abadi, M., Torgersen, M.: Understanding TypeScript. In: Jones, R. (ed.) ECOOP 2014. LNCS, vol. 8586, pp. 257–281. Springer, Heidelberg (2014). https://doi.org/10.1007/978-3-662-44202-9_11
 6.Bonnaire-Sergeant, A., Davies, R., Tobin-Hochstadt, S.: Practical optional types for Clojure. In: Thiemann, P. (ed.) ESOP 2016. LNCS, vol. 9632, pp. 68–94. Springer, Heidelberg (2016). https://doi.org/10.1007/978-3-662-49498-1_4
 7.Castagna, G., Lanvin, V.: Gradual typing with union and intersection types. Proc. ACM Program. Lang. 1(ICFP), 41:1–41:28 (2017)
 8.Cimini, M., Siek, J.G.: The gradualizer: a methodology and algorithm for generating gradual type systems. In: Proceedings of the 43rd Symposium on Principles of Programming Languages (2016)
 9.Cimini, M., Siek, J.G.: Automatically generating the dynamic semantics of gradually typed languages. In: Proceedings of the 44th Symposium on Principles of Programming Languages (2017)
 10.Devriese, D., Patrignani, M., Piessens, F.: Parametricity versus the universal type. Proc. ACM Program. Lang. 2(POPL), 38 (2017)
 11.Dunfield, J., Krishnaswami, N.R.: Complete and easy bidirectional typechecking for higher-rank polymorphism. In: International Conference on Functional Programming (2013)
 12.Garcia, R., Cimini, M.: Principal type schemes for gradual programs. In: Proceedings of the 42nd Symposium on Principles of Programming Languages (2015)
 13.Garcia, R., Clark, A.M., Tanter, É.: Abstracting gradual typing. In: Proceedings of the 43rd Symposium on Principles of Programming Languages (2016)
 14.Igarashi, Y., Sekiyama, T., Igarashi, A.: On polymorphic gradual typing. In: Proceedings of the 22nd International Conference on Functional Programming (2017)
 15.Jafery, K.A., Dunfield, J.: Sums of uncertainty: refinements go gradual. In: Proceedings of the 44th Symposium on Principles of Programming Languages (2017)
 16.Mitchell, J.C.: Polymorphic type inference and containment. In: Logical Foundations of Functional Programming (1990)
 17.Odersky, M., Läufer, K.: Putting type annotations to work. In: Proceedings of the 23rd Symposium on Principles of Programming Languages (1996)
 18.Jones, S.P., Vytiniotis, D., Weirich, S., Shields, M.: Practical type inference for arbitrary-rank types. J. Funct. Program. 17(1), 1–82 (2007)
 19.Reynolds, J.C.: Types, abstraction and parametric polymorphism. In: Proceedings of the IFIP 9th World Computer Congress (1983)
 20.Reynolds, J.C.: The coherence of languages with intersection types. In: Ito, T., Meyer, A.R. (eds.) TACS 1991. LNCS, vol. 526, pp. 675–700. Springer, Heidelberg (1991). https://doi.org/10.1007/3-540-54415-1_70
 21.Siek, J.G., Taha, W.: Gradual typing for functional languages. In: Proceedings of the 2006 Scheme and Functional Programming Workshop (2006)
 22.Siek, J., Taha, W.: Gradual typing for objects. In: Ernst, E. (ed.) ECOOP 2007. LNCS, vol. 4609, pp. 2–27. Springer, Heidelberg (2007). https://doi.org/10.1007/978-3-540-73589-2_2
 23.Siek, J.G., Vachharajani, M.: Gradual typing with unification-based inference. In: Proceedings of the 2008 Symposium on Dynamic Languages (2008)
 24.Siek, J.G., Wadler, P.: The key to blame: gradual typing meets cryptography (draft) (2016)
 25.Siek, J.G., Vitousek, M.M., Cimini, M., Boyland, J.T.: Refined criteria for gradual typing. In: LIPIcs-Leibniz International Proceedings in Informatics (2015)
 26.Verlaguet, J.: Facebook: analyzing PHP statically. In: Proceedings of Commercial Users of Functional Programming (2013)
 27.Vitousek, M.M., Kent, A.M., Siek, J.G., Baker, J.: Design and evaluation of gradual typing for Python. In: Proceedings of the 10th Symposium on Dynamic Languages (2014)
 28.Vytiniotis, D., Jones, S.P., Magalhães, J.P.: Equality proofs and deferred type errors: a compiler pearl. In: Proceedings of the 17th International Conference on Functional Programming, ICFP 2012, New York (2012)
 29.Wadler, P., Findler, R.B.: Well-typed programs can’t be blamed. In: Castagna, G. (ed.) ESOP 2009. LNCS, vol. 5502, pp. 1–16. Springer, Heidelberg (2009). https://doi.org/10.1007/978-3-642-00590-9_1
Copyright information
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made. The images or other third party material in this book are included in the book's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the book's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.