Keywords

1 Introduction

Mathematical structures are the backbone of the axiomatic method advocated by Bourbaki [8, 9] to spell out mathematically relevant abstractions and establish the corresponding vocabulary and notations. They are instrumental in making the mathematical literature more precise, concise, and intelligible. Modern libraries of formalized mathematics also rely on hierarchies of mathematical structures to achieve modularity, akin to interfaces in generic programming. By analogy, we call instance a witness of a mathematical structure on a given carrier. Mathematical structures, as interfaces, are essential to factor out the shared vocabulary attached to their instances. This vocabulary comes in the form of formal definitions and generic theorems, but also parsable and printable notations, and sometimes delimited automation. Some mathematical structures are richer than others in the sense that they extend them. Like in generic programming, rich structures inherit the vocabulary attached to poorer structures. Working out the precise meaning of symbols of this shared vocabulary is usually performed by enhanced type inference, which is implemented using type classes [5, 27, 28] or unification hints [3, 17, 20, 24]. In particular, these mechanisms must automatically identify inheritance relations between structures.

This paper discusses the design of a hierarchy of mathematical structures supporting a Coq [30] formal library for functional analysis, i.e., the study of spaces of functions, and of structure-preserving transformations on them. The algebraic vocabulary of linear algebra is complemented with a suitable notion of “closeness” (e.g., topology, distance, norm), so as to formalize convergence, limits, size, etc. This hierarchy is based on the packed classes methodology [17, 20], which represents structures using dependent records. The library strives to provide notations and theories that are as generic as they would be on paper. It is an extension of the “Mathematical Components” library [32] (hereafter MathComp), which is geared towards algebra. This extension is inspired by the Coquelicot real analysis library [7], which has its own hierarchy.

Fusing these two hierarchies, respectively from the MathComp and Coquelicot libraries, happens to trigger several interesting issues, related to inheritance relations. Indeed, when several constructions compete to infer an instance of the poorer structure, the proof assistant displays indiscernible notations, or keywords, for constructions that are actually different. This issue is not at all specific to the formalization of functional analysis: actually, the literature reports examples of this exact problem but in different contexts, e.g., in Lean’s mathlib [11, 33]. It is however more likely to happen when organizing different flavors of mathematics in a same coherent corpus, as this favors the presence of problematic competing constructions. Up to our knowledge, the problem of competing inheritance paths in hierarchies of dependent records was never discussed per se, beyond some isolated reports of failure, and of ad hoc solutions. We thus present and discuss a general methodology to overcome this issue, coined forgetful inheritance, based on packed classes and unification hints.

The paper is organized as follows: in Sect. 2, we recall the packed classes methodology, using a running example. Section 3 provides two concrete examples of the typical issues raised by the presence of competing inheritance paths, before describing the general issue, drawing its solution, and comparing with other type-class-like mechanisms. Finally, Sect. 4 describes the design of our hierarchy of structures for functional analysis, and its features, before Sect. 5 concludes.

2 Structures, Inheritance, Packed Classes

We recall some background on the representation of mathematical structures in dependent type theory, and on the construction of hierarchies using packed classes. For that purpose, we use a toy running example (see the accompanying file  [1]), loosely based on the case study presented in Sect. 4.

2.1 Dependent Records

In essence, a mathematical structure attaches to a carrier set some data (e.g., operators of the structure, collections of subsets of the carrier) and prescribed properties about these data, called the axioms of the structure. The Calculus of Inductive Constructions [16], as implemented, e.g., by Coq [30], Agda [29], or Lean [21, 22], provides a dependent record construct, which allows to represent a given mathematical structure as a type, and its instances as terms of that type. A dependent record is an inductive type with a single constructor, which generalizes dependent pairs to dependent tuples. The elements of such a tuple are the arguments of the single constructor. They form a telescope [10], i.e., collection of terms, whose types can depend on the previous items in the tuple.

For example, the record type formalizes a structure with two operators, and , and one axiom . This structure is a toy generalisation for the mathematical notion of normed module. Its purpose is to simulate one basic axiom of norms via a minimal amount of constructors. Thus, the has a single constructor named , and four projections (also called fields) , , , and , onto the respective components of the tuple:

figure l

The axioms makes use of to avoid the introduction of a of carrier type. Fields have a dependent type, parameterized by the one of the structure:

figure p

In this case, declaring an instance of this structure amounts to defining a term of type , which packages the corresponding instances of carrier, data, and proofs. For example, here is an instance on carrier (using the and functions from the standard library resp. as the and the operators):

figure w

2.2 Inference of Mathematical Structures

Hierarchies of mathematical structures are formalized by nesting dependent records but naive approaches quickly incur scalability issues. Packed classes [17, 20] provide a robust and systematic approach to the organization of structures into hierarchies. In this approach, a structure is a two-field record, which associates a carrier with a class. A class encodes the inheritance relations of the structure and packages various mixins. Mixins in turn provide the data, and their properties. In Coq, and are synonyms, but we reserve the latter for record types that represent actual structures. Let us explain the basics of inference of mathematical structures with packed classes by replacing the structure of Sect. 2.1 with two structures represented as packed classes. The first one provides just a binary operator:

figure z

Since the structure is expected to be the bottom of the hierarchy, we are in the special class where the class is the same as the mixin (here, the class would be equal to ). To endow the operator with a generic infix notation, we introduce a definition , parameterized by an instance of . In the definition of the corresponding notation, the wildcard is a placeholder for the instance of to be inferred from the context.

figure ah

We can build an instance of the structure with the type of integers as the carrier and the subtraction of integers for the operator:

figure aj

But defining an instance is not enough to make the  notation available:

figure al

To type-check the expression just above, Coq needs to fill the wildcard in the notation, which amounts to solving the equation , where is the definitional equality (i.e., equality up to the conversion rules of Coq type theory [30, Section “Conversion Rules” in Chapter “Calculus of Constructions”]). One can indicate that the instance is a canonical solution by declaring it as a canonical instance:

figure aq

This way, Coq fills the wildcard in with and retrieves as expected the subtraction for integers.

2.3 Inheritance and Packed Classes

We introduce a second structure class to illustrate how inheritance is implemented. This structure extends the structure of Sect. 2.2 with a norm operator and a property (the fact that is 0 for any ):

figure aw

The new mixin for the norm appears at line 1 (it takes a structure as parameter), the new class appears at line 4, and the structureFootnote 1 at line 7. It is the class that defines the inheritance relation between and (at line 6 precisely). The definitions above are however not enough to achieve proper inheritance. For example, does not yet enjoy the notation coming with the structure:

figure be

Here, Coq tries to solve the following equationFootnote 2:

figure bg

The solution consists in declaring a canonical way to build a structure out of a structure in the form of a function that Coq can use to solve the equation above (using in this particular case):

figure bk

3 Inheritance with Packed Classes and type classes

When several inheritance paths compete to establish that one structure extend the other, the proof assistant may display misleading information to the user and prevent proofs.

3.1 Competing Inheritance Paths

We extend the running example of Sect. 2 with a third and last structure. Our objective is to implement a toy generalisation of the mathematical notion of pseudometric space. This is done by introducing a reflexive relation mimicking the belonging of one of the argument to a unit ball around the other argument. The structure below provides a binary relation (line 2) together with a property of reflexivity (line 3):

figure bm

For the sake of the example, we furthermore declare a canonical way of building a structure out of a structure:

figure bp

We first illustrate the issue using a construction (here the Cartesian product) that preserves structures, and that is used to build canonical instances. First, we define the product of structures, and tag it as canonical:

figure br

Similarly, we define canonical products of and :

figure bu

The problem is that our setting leads Coq’s type-checker to fail in unexpected ways, as illustrated by the following example:

figure bv

The hypothesis applies to any goal where the type of  is of type , so that one may be led to think that it should also apply in the case of a product of , since there is a canonical way to build one. What happens is that the type-checker is looking for an instance of a normed module that satisfies the following equation:

figure cb

while the canonical instance Coq infers is , which does not satisfy the equation. In particular, is definitionally equal to on the left-hand side and on the right-hand side: the two are not definitionally equal. One can describe the problem as the fact that the diagram in Fig. 1 does not commute definitionally.

Fig. 1.
figure 1

Diagrammatic explanation for the failure of the first example of Sect. 3.1

This is of course not specific to Cartesian products and similar problems would also occur when lifting dependent products, free algebras, closure, completions, etc., on metric spaces, topological groups, etc. as well as in simpler settings without generic constructions as illustrated by our last example.

As a consequence of the definition of , the following lemma about balls is always true for any :

figure ci

For the sake of the example, we define canonical instances of the and structures with integers:

figure cl

Since the generic lemma holds, the user might expect to use it to prove a version specialized to integers. This is however not the case as the following script shows:

figure cn

The problem is that on the left-hand side Coq infers the instance with the relation, while on the right-hand side it infers the instance whose is definitionally equal to , which is not definitionally equal to the newly defined . In other words, the problem is the multiple ways to construct a “canonical instance” of with carrier , as in Fig. 2.

Fig. 2.
figure 2

Diagrammatic explanation for the type-checking failure of the second example of Sect. 3.1: the dashed arrows represent the inference of an instance from the carrier type; the outer diagrams commutes, while the inner one does not

The solution to the problems explained in this section is to ensure definitional equality by including poorer structures into richer ones; this way, “deducing” one structure from the other always amounts to erasure of data, and this guarantees there is a unique and canonical way of getting it. We call this technique forgetful inheritance, as it is reminiscent of forgetful functors in category theory.

3.2 Forgetful Inheritance with Packed Classes

When applied to the first problem exposed in Sect. 3.1, forgetful inheritance makes the diagram of Fig. 1 commute definitionally. Indeed, the only way to achieve commutation is to have be a mere erasure. This means that one needs to include inside each instance of a canonical (line 7 below). Furthermore the must record the compatibility between the operators and (line 4 below):

figure dc

Since every includes a canonical , the construction of the canonical given a is exactly a projection:

figure dh

As a consequence, the equation

figure di

holds with and the diagram in Fig. 1 (properly updated with the new structure) commutes definitionally, and so does the diagram in Fig. 2, for the same reasons.

Factories. Because of the compatibility axioms required by forgetful inheritance, the formal definition of a structure can depart from the expected presentation. In fact, with forgetful inheritance, the very definition of a mathematical structure should be read in factories, i.e., functions that construct the mixins starting from only the expected axioms. And records are rather interfaces, in a software engineering terminology. Note that just like there can be several equivalent presentations of a same mathematical stuctures, several mixins can be associated with a same target .

In our running example, one can actually derive, from the previously defined , two mixins for both :

figure dp

(where the relation is the one induced by the norm, by construction) and :

figure ds

These two mixins make the source of two factories we mark as coercions, in order to help building two structures:

figure du

The second part of this paper provides concrete examples of factories for our hierarchy for functional analysis.

3.3 Forgetful Inheritance with type classes

Type class mechanisms [5, 27, 28] propose an alternative implementation of hierarchies. Inference relations are coded using parameters rather than projections, and proof search happens by enhancing the resolution of implicit arguments. But the issue of competing inheritance paths does not pertain to the inference mechanism at stake, nor to the prover which implements them. Its essence rather lies in the non definitionally commutative diagrams as in Fig. 1 and Fig. 2.

We illustrate this with a type classes version of our examples, in both Coq and Lean, using a semi-bundled approach (see the accompanying files and  [1]). Compared to the packed class approach, hierarchies implemented using type classes remove the structure layer, which packages the carrier and the class. Hence our example keeps only the records whose name starts with is, declares them as type classes, and substitutes declarations with appropriate declarations.

The choice on the level of bundling in the resulting classes, i.e., what are parameters of the record, and what are its fields, is not unique. For example, one may choose to formalize rings as groups extended with additional operations and axioms, or as a class on a type which is also a group.

figure dz

By contrast, a structure in the packed class approach must always package with its carrier every mixins and classes that characterize the structure. The transposition of forgetful inheritance to type class would apply to fully bundled classes (where all the operations and axioms are bundled but not the carrier).

Because it triggers no “backtracking search”, the use of packed classes and unification hints described in this paper avoids some issues encountered in mathlib [33, Sect. 4.3], which are more explicitly detailed in the report on the implementation of type classes in Lean 4 [26]. We do not know either how a type class version of forgetful inheritance would interact with the performance issues described in the latter paper, or whether tabling helps. Moreover, with the current implementations of type classes in both Coq and Lean, different choices on bundling may have dramatic consequences on resolution feasibility and performance. For example, former experiments in rewriting MathComp with type classes in Coq did not scale up to modules on a ring. Incidentally, our companion file illustrates some predictability issues of the current implementation of Coq type classes.

4 The Mathematical Components Analysis Library

The Coquelicot library comes with its own hierarchy of mathematical structures and the intent of the MathComp-Analysis library is to improve it with the algebraic constructs of the MathComp library, for the analysis of multivariate functions for example. This section explains three applications of forgetful inheritance that solve three design issues of a different nature raised by merging MathComp and Coquelicot, as highlighted in Fig. 3.

We begin by an overview of the mathematical notions we deal with in this section. A topological space is a set endowed with a topology, i.e., a total collection of open sets closed under finite intersection and arbitrary unions. Equivalently, a topology can be described by the neighborhood filter of each point. A neighborhood of a point x is a set containing an open set around x; the neighborhood filter of a point x is the set of all neighborhoods of x. In MathComp-Analysis, neighborhood filters are the primary component of topological spaces. Pseudometric spaces are intermediate between topological and metric spaces. They were introduced as the minimal setting to handle Cauchy sequences. In Coquelicot, pseudometric spaces are called “uniform spaces” and are formalized as spaces endowed with a suitable predicate. This is the topic of Sect. 4.1. Coquelicot also provides normed spaces, i.e., \(\mathbb {K}\)-vector spaces E endowed with a suitable norm. On the other hand, in MathComp, the minimal structure with a norm operator corresponds to numerical domains [12, Chap. 4][13, Sect. 3.1], i.e., integral domains with order and absolute value. This situation led to a generalization of MathComp described in Sect. 4.2. Finally, in Sect. 4.3, we explain how to do forgetful inheritance across the two distinct libraries MathComp and MathComp-Analysis.

Fig. 3.
figure 3

Excerpt of MathComp and MathComp-Analysis hierarchies. Each rounded box corresponds to a mathematical structure except for (Com)(Unit)Ring that corresponds to several structures [17]. Dotted boxes correspond to mathematical structures introduced in Sect. 4.2 and Sect. 4.3. Thick, red arrows correspond to forgetful inheritance.

4.1 Forgetful Inheritance from Pseudometric to Topological Spaces

When formalizing topology, we run into a problem akin to Sect. 3.1 because we face several competing notions of neighborhoods; we solve this issue with forgetful inheritance as explained in Sect. 3.

A neighborhood of a point p can be defined at the level of topological spaces using the notion of open as a set A that contains an open set containing p:

$$\begin{aligned} \exists B.\ B \text { is open, } p \in B \text { and } B \subseteq A. \end{aligned}$$
(1)

or at the level of pseudometric spaces as a set A that contains a ball containing p:

$$\begin{aligned} \exists \varepsilon > 0.\ B_{\varepsilon }(p) \subseteq A. \end{aligned}$$
(2)

We ensure these two definitions of neighborhoods coincide by adding to mixins compatibility axioms that constrain a shared function. The function in question maps a point to a set of neighborhoods (hereafter ), it is shared between the mixins for topological and pseudometric spaces, and constrained by the definitions of open set and ball as in Formulas (1) and (2). More precisely, the mixin for topological spaces introduces the set of open sets (see line 3 below) and defines neighborhoods as in Formula (1) (at line 5). We complete the definition by specifying with a specific axiom (not explained in detail here) that neighborhoods are proper filters (line 4) and with an alternative characterization of open set (namely that an open set is a neighborhood of all of its points, line 6).

figure ed

The mixin for pseudometric spaces introduces the notion of balls (line 10) and defines neighborhoods as in Formula (2) (at line 12, corresponds to the set of supersets of balls). The rest of the definition (line 11) are axioms about  which are omitted for lack of space.

figure eg

Here, our definition of topological space departs from the standard definition as a space endowed with a family of subsets containing the full set and the empty set and closed under union and by finite intersection. However, the latter definition can be recovered from the former. Factories (see Sect. 3.2) are provided for users who want to give only  and to infer  (using [31, file topology.v, definition topologyOfOpenMixin]), or the other way around.

4.2 Forgetful Inheritance from Numerical Domain to Normed Abelian Group

The second problem we faced when developing the MathComp-Analysis library is the competing formal definitions of norms and absolute values. The setting is more complicated than Sect. 4.1 as it involves amending the hierarchy of mathematical structures of the MathComp library.

While the codomain of a norm is always the set of (non-negative) reals, an absolute value on a is always an endofunction of type . Thanks to this design choice, the absolute value preserves some information about its input, e.g., the integrality of an integer. On the other hand, the Coquelicot library also had several notions of norms: the absolute value of the real numbers (from the Coq standard library), the absolute value of a structure for rings equipped with an absolute value, and the norm operator of normed modules (the latter two are Coquelicot-specific).

We hence generalize the norm provided by the MathComp library to encompass both absolute values on numerical domains and norms on vector spaces, and share notation and lemmas. This is done by introducing a new structure in MathComp called , for normed Abelian groups, since Abelian groups are called \(\mathbb {Z}\)-modules in MathComp. This structure is now the poorest structure with a norm, which every subsequent normed type will inherit from.

The definition of the structure requires to solve a mutual dependency problem. Indeed, to state the fundamental properties of norms, such as the triangle inequality, the codomain of the norm function should be at least an ordered and normed Abelian group, requiring to be parameterized by such a structure. However the codomain should also inherit from to share the notations for norm and absolute value.

Our solution is to dispatch the order and the norm originally contained in between normed Abelian groups and partially ordered types as depicted in Fig. 3. We define the two following mixins for and .

figure ev

Now we define (which is an abbreviation for ) using these two mixins but without declaring inheritance from (yet to be defined). More precisely, the class of includes the mixin for (at line 5 below), which will allow for forgetful inheritance:

figure fb

Finally, we define the class of , parameterized by a :

figure fe

It is only then that we declare inheritance from the structure to , effectively implementing forgetful inheritance. We finally end up with a norm of the general type

figure fh

Illustration: sharing of norm notation and lemmas. As an example, we explain the construction of two norms and show how they share notation and lemmas. In MathComp, the type of matrices is where is the type of coefficients. The norm takes the maximum of the absolute values of the coefficients:

figure fl

This definition uses the generic big operator [6] to define a “big max” operation out of the binary operation  . Similarly, we define a norm for pairs of elements by taking the maximum of the absolute value of the two projectionsFootnote 3:

figure fp

We then go on proving that these definitions satisfy the properties of the norm and declare canonical instances of for matrices and pairs (see [31] for details). All this setting is of course carried out in advance and the user only sees one notation and one set of lemmas (for example for the triangle inequality), so that (s)he can mix various norms transparently in the same development, as in the following two examples:

figure fs

One could fear that the newly introduced structures make the library harder to use since that, to declare a canonical instance, a user also needs now to declare canonical and instances of the same type. Here, the idea of factories (Sect. 3.2) comes in handy for the original mixin has been re-designed as a factory producing , , and mixins in order to facilitate their declaration.

4.3 Forgetful Inheritance from Normed Modules to Pseudometric Spaces

The combination of the MathComp library with topological structures ultimately materializes as a mathematical structure for normed modules. It is made possible by introducing an intermediate structure that combines norm (from algebra) with pseudometric (from topology) into normed Abelian groups. The precise justification for this first step is as follows.

Since normed Abelian groups have topological and pseudometric space structures induced by the norm, should inherit from . To do so, we can (1) insert a new structure above , or (2) create a common extension of and . We choose (2) to avoid amending the MathComp library. This makes both and its extension normed Abelian groups, where the former is inadequate for topological purposes.

The only axiom of this extension is the compatibility between the pseudometric and the norm, as expressed line 5 below, where has been seen in Sect. 4.1 and the right-hand side represents all the ternary relations \(\lambda x,\varepsilon ,y.\,|x-y|<\varepsilon \):

figure gi

The extension is effectively performed by using this mixin in the following class definition at line 12 (see also Fig. 3):

figure gj

Finally, the bridge between algebraic and topological structures is completed by a common extension of a normed Abelian group ( ) with a left-module ( from the MathComp library, which provides scalar multiplication), extended with the axiom of linearity of the norm for the scalar product (line 5 below).

figure gm

One can again observe here the overloaded notation for norms explained in Sect. 4.2. The accompanying file  [1] provides an overview of MathComp-Analysis operations regarding norms and scalar notations.

We ensured that the structure of normed modules indeed serves its intended purpose of enabling multivariate functional analysis by generalizing existing theories of Bachmann-Landau notations and of differentiation [2, Sect. 4].

5 Conclusion and Related Work

This paper has two main contributions: forgetful inheritance using packed classes, and the hierarchy of the MathComp-Analysis library. The latter library is still in its infancy and covers far less real and complex analysis than the libraries available in HOL Light and Isabelle/HOL [19, 23]. However, differences in foundations matter here, and the use of dependent types in type-class-like mechanisms is instrumental in the genericity of notations illustrated in this paper. Up to our knowledge, no other existing formal library in analysis has comparable sharing features.

The methodology presented in this paper to tame competing inheritance paths in hierarchies of dependent records is actually not new. The original description of packed classes [17, end of Sect. 3.1] already mentions that a choice operator (in fact, a mixin) should be included in the definition of a structure for countable types, even if choice operators can be defined for countable types in Coq without axiom. Yet, although the MathComp library uses forgetful inheritance at several places in its hierarchy, this solution was never described in earlier publications, nor was the issue precisely described. Proper descriptions, as well as the comparison with other inference techniques, are contributions of the present paper.

As explained in Sect. 3.3, type classes based on augmented inference of implicit arguments also allow for a variant of forgetful inheritance. For instance, Buzzard et al. mention that this pattern is used for defining metric spaces both in Lean and Isabelle/HOL’s libraries [11, Sect. 3]. In the same paper, the authors describe another formalization issue, about completing abelian groups and rings, pertaining to the same problem [11, Sect. 5.3], and which can be solved as well using forgetful inheritance.

Extensional flavors of dependent type theory could make the problematic diagram in Fig. 1 commute judgmentally. However, to the best of our knowledge, the related tools [4, 15] available at the time of writing do not implement any of the type class mechanisms discussed here.

Packed classes, and forgetful inheritance, already proved robust and efficient enough to formalize and populate large hierarchies [18], where “large” applies both to the number of structures and to the number of instances. Arguably, this approach also has drawbacks: defining deep hierarchies becomes quite verbose, and inserting new structures is tedious and error-prone. We argue that, compared to their obvious benefits in control and efficiency of the proof search, this is not a fundamental issue. As packed classes are governed by systematic patterns and invariants, this rather calls for more inspection [25] and automated generation [14] tooling, which is work in progress.