Coalgebra Learning via Duality
Abstract
Automata learning is a popular technique for inferring minimal automata through membership and equivalence queries. In this paper, we generalise learning to the theory of coalgebras. The approach relies on the use of logical formulas as tests, based on a dual adjunction between states and logical theories. This allows us to learn, e.g., labelled transition systems, using HennessyMilner logic. Our main contribution is an abstract learning algorithm, together with a proof of correctness and termination.
1 Introduction
In recent years, automata learning is applied with considerable success to infer models of systems and in order to analyse and verify them. Most current approaches to active automata learning are ultimately based on the original algorithm due to Angluin [4], although numerous improvements have been made, in practical performance and in extending the techniques to different models [30].
Our aim is to move from automata to coalgebras [14, 26], providing a generalisation of learning to a wide range of statebased systems. The key insight underlying our work is that dual adjunctions connecting coalgebras and tailormade logical languages [12, 19, 21, 22, 26] allow us to devise a generic learning algorithm for coalgebras that is parametric in the type of system under consideration. Our approach gives rise to a fundamental distinction between states of the learned system and tests, modelled as logical formulas. This distinction is blurred in the classical DFA algorithm, where tests are also used to specify the (reachable) states. It is precisely the distinction between tests and states which allows us to move beyond classical automata, and use, for instance, HennessyMilner logic to learn bisimilarity quotients of labelled transition systems.
To present learning via duality we need to introduce new notions and refine existing ones. First, in the setting of coalgebraic modal logic, we introduce the new notion of subformula closed collections of formulas, generalising suffixclosed sets of words in Angluin’s algorithm (Sect. 4). Second, we import the abstract notion of base of a functor from [8], which allows us to speak about ‘successor states’ (Sect. 5). In particular, the base allows us to characterise reachability of coalgebras in a clear and concise way. This yields a canonical procedure for computing the reachable part from a given initial state in a coalgebra, thus generalising the notion of a generated subframe from modal logic.
We then rephrase coalgebra learning as the problem of inferring a coalgebra which is reachable, minimal and which cannot be distinguished from the original coalgebra held by the teacher using tests. This requires suitably adapting the computation of the reachable part to incorporate tests, and only learn ‘up to logical equivalence’. We formulate the notion of closed table, and an associated procedure to close tables. With all these notions in place, we can finally define our abstract algorithm for coalgebra learning, together with a proof of correctness and termination (Sect. 6). Overall, we consider this correctness and termination proof as the main contribution of the paper; other contributions are the computation of reachability via the base and the notion of subformula closedness. At a more conceptual level, our paper shows how states and tests interact in automata learning, by rephrasing it in the context of a dual adjunction connecting coalgebra (systems) and algebra (logical theories). As such, we provide a new foundation of learning statebased systems.
Related Work. The idea that tests in the learning algorithm should be formulas of a distinct logical language was proposed first in [6]. However, the work in loc.cit. is quite adhoc, confined to Booleanvalued modal logics, and did not explicitly use duality. This paper is a significant improvement: the dual adjunction framework and the definition of the base [8] enables us to present a description of Angluin’s algorithm in purely categorical terms, including a proof of correctness and, crucially, termination. Our abstract notion of logic also enables us to recover exactly the standard DFA algorithm (where tests are words) and the algorithm for learning Mealy machines (where test are manyvalued), something that is not possible in [6] where tests are modal formulas. Closely related to our work is also the line of research initiated by [15] and followed up within the CALF project [11, 12, 13] which applies ideas from category theory to automata learning. Our approach is orthogonal to CALF: the latter focuses on learning a general version of automata, whereas our work is geared towards learning bisimilarity quotients of statebased transition systems. While CALF lends itself to studying automata in a large variety of base categories, our work thus far is concerned with varying the type of transition structures.
2 Learning by Example
The aim of this section is twofold: (i) to remind the reader of the key elements of Angluin’s L\(^*\) algorithm [4] and (ii) to motivate and outline our generalisation.
In the classical L\(^*\) algorithm, the learner tries to learn a regular language \(\mathcal {L}\) over some alphabet A or, equivalently, a DFA \(\mathcal {A}\) accepting that language. Learning proceeds by asking queries to a teacher who has access to this automaton. Membership queries allow the learner to test whether a given word is in the language, and equivalence queries to test whether the correct DFA has been learned already. The algorithm constructs socalled tables (S, E) where \(S, E \subseteq A^*\) are the rows and columns of the table, respectively. The value at position (s, e) of the table is the answer to the membership query “\(se \in \mathcal {L}\)?”.
Words play a double role: On the one hand, a word \(w \in S\) represents the state which is reached when reading w at the initial state. On the other hand, the set E represents the set of membership queries that the learner is asking about the states in S. A table is closed if for all \(w \in S\) and all \(a \in A\) either \(wa \in S\) or there is a state \(v \in S\) such that wa is equivalent to v w.r.t. membership queries of words in E. If a table is not closed we extend S by adding words of the form wa for \(w \in S\) and \(a \in A\). Once it is closed, one can define a conjecture,^{1} i.e., a DFA with states in S. The learner now asks the teacher whether the conjecture is correct. If it is, the algorithm terminates. Otherwise the teacher provides a counterexample: a word on which the conjecture is incorrect. The table is now extended using the counterexample. As a result, the table is not closed anymore and the algorithm continues again by closing the table.
Our version of L\(^*\) introduces some key conceptual differences: tables are pairs \((S,\varPsi )\) such that S (set of rows) is a selection of states of \(\mathcal {A}\) and \(\varPsi \) (set of columns) is a collection of tests/formulas. Membership queries become checks of tests in \(\varPsi \) at states in S and equivalence queries verify whether or not the learned structure is logically equivalent to the original one. A table \((S,\varPsi )\) is closed if for all successors \(x'\) of elements of S there exists an \(x \in S\) such that x and \(x'\) are equivalent w.r.t. formulas in \(\varPsi \). The clear distinction between states and tests in our algorithm means that counterexamples are formulas that have to be added to \(\varPsi \). Crucially, the move from words to formulas allows us to use the rich theory of coalgebra and coalgebraic logic to devise a generic algorithm.
We consider two examples within our generic framework: classical DFAs, yielding essentially the L\(^*\) algorithm, and labelled transition systems, which is to the best of our knowledge not covered by standard automata learning algorithms.
As above, the algorithm constructs tables, starting with \((S = \{x_0\}, \varPsi = \emptyset )\). The table is closed, so the first conjecture is a single state with an aloop with no proposition letter true (note that \(x_0\) has no b or c successor and no proposition is true at \(x_0\)). It is, however, easy for the teacher to find a counterexample. For example, the formula \(\left\langle a \right\rangle \left\langle b \right\rangle \top \) is true at the root of the original LTS but false in the conjecture. We add the counterexample and all its subformulas to \(\varPsi \) and obtain a new table \((\{x_0\},\varPsi '\}\) with \(\varPsi ' = \{ \left\langle a \right\rangle \left\langle b \right\rangle \top , \left\langle b \right\rangle \top , \top \}\). Now, the table is not closed, as \(x_0\) has successor \(x_1\) that satisfies \(\left\langle b \right\rangle \top \) whereas \(x_0\) does not satisfy \(\left\langle b \right\rangle \top \). Therefore we add \(x_1\) to the table to obtain \((\{x_0,x_1\},\varPsi ')\). Similar arguments will lead to the closed table \((\{x_0,x_1,x_3,x_4\},\varPsi ')\) which also yields the correct conjecture. Note that the state \(x_2\) does not get added to the table as it is equivalent to \(x_1\) and thus already represented. This demonstrates a remarkable fact: we computed the bisimulation quotient of the LTS without inspecting the (infinite) righthand side of the LTS.
Another important example that fits smoothly into our framework is the wellknown variant of Angluin’s algorithm to learn Mealy machines (Example 2). Thanks to our general notion of logic, our framework allows to use an intuitive language, where a formula is simply an input word w whose truth value at a state x is the observed output after entering w at x. This is in contrast to [6] where formulas had to be Boolean valued. Multivalued logics fit naturally in our setting; this is expected to be useful to deal with systems with quantitative information.
3 Preliminaries
The general learning algorithm in this paper is based on the theory of coalgebras, which provides an abstract framework for representing statebased transition systems. In what follows we assume that the reader is familiar with basic notions of category theory and coalgebras [14, 26]. We briefly recall the notion of pointed coalgebra, modelling a coalgebra with an initial state. Let \(\mathcal {C}\) be a category with a terminal object 1 and let \(B:\mathcal {C} \rightarrow \mathcal {C}\) be a functor. A pointed \(B\)coalgebra is a triple \((X,\gamma ,x_0)\) where \(X \in \mathcal {C}\) and \(\gamma :X \rightarrow BX\) and \(x_0:1 \rightarrow X\), specifying the coalgebra structure and the point (“initial state”) of the coalgebra, respectively.
Coalgebraic Modal Logic. Modal logics are used to describe properties of statebased systems, modelled here as coalgebras. The close relationship between coalgebras and their logics is described elegantly via dual adjunctions [18, 20, 21, 24].
Example 1
Let \(\mathcal {C}= \mathcal {D} = \mathsf {Set}, P=Q= 2^{}\) the contravariant power set functor, \(B= 2 \times ^{A}\) and \(L = 1 + A \times \). In this case \(B\)coalgebras can be thought of as deterministic automata with input alphabet A (e.g., [25]). It is wellknown that the initial Lalgebra is \(\varPhi = A^{*}\) with structure \(\alpha = [\varepsilon , \mathrm {cons}]:1 + A \times A^* \rightarrow A^*\) where \(\varepsilon \) selects the empty word and \(\mathrm {cons}\) maps a pair \((a,w) \in A \times A^*\) to the word \(aw \in A^*\), i.e., in this example our tests are words with the intuitive meaning that a test succeeds if the word is accepted by the given automaton. For \(X \in \mathcal {C}\), the Xcomponent of the (onestep) semantics \(\delta :L P\Rightarrow PB\) is defined as follows: \( \delta _X (*) = \lbrace (i, f) \in 2 \times X^A \mid i = 1 \rbrace \), and \(\delta _X (a, U) = \lbrace (i, f) \in 2 \times X^A \mid f(a) \in U \rbrace \). It is matter of routine checking that the semantics of tests in \(\varPhi \) on a \(B\)coalgebra \((X,\gamma )\) is as follows: we have \(\llbracket \varepsilon \rrbracket = \{ x \in X \mid \pi _1 (\gamma (x)) = 1 \}\) and Open image in new window , where \(\pi _1\) and \(\pi _2\) are the projection maps. The theory map \( th _{}^{\gamma }\) sends a state to the language accepted by that state in the usual way.
Example 2
Again let \(\mathcal {C} = \mathcal {D} = \mathsf {Set}\) and consider the functors \(P= Q= O^{}\), \(B= (O \times )^A\) and \(L = A \times (1 + )\), where A and O are fixed sets, thought of as input and output alphabet, respectively. Then \(B\)coalgebras are Mealy machines and the initial Lalgebra is given by the set \(A^+\) of finite nonempty words over A. For \(X \in \mathcal {C}\), the onestep semantics \(\delta _X:A \times (1 + O^X) \rightarrow O^{BX}\) is defined by \(\delta _X(a,\mathrm {inl(*)}) = \lambda f . \pi _1 (f(a))\) and \(\delta _X(a,\mathrm {inr}(g)) = \lambda f. g(\pi _2(f(a)))\). Concretely, formulas are words in \(A^+\); the (Ovalued) semantics of \(w \in A^+\) at state x is the output \(o \in O\) that is produced after processing the input w from state x.
Example 3
Let \(\mathcal {C} = \mathsf {Set}\) and \(\mathcal {D} = \mathsf {BA}\), where the latter denotes the category of Boolean algebras. Again \(P= 2^{}\), but this time \(2^X\) is interpreted as a Boolean algebra. The functor \(Q\) maps a Boolean algebra to the collection of ultrafilters over it [7]. Furthermore \(B=(\mathcal {P})^A\) where \(\mathcal {P}\) denotes covariant power set and A a set of actions. Coalgebras for this functor correspond to labelled transition systems, where a state has a set of successors that depends on the action/input from A. The dual functor \(L:\mathsf {BA}\rightarrow \mathsf {BA}\) is defined as \(L Y \mathrel {:=} F_{\mathsf {BA}} (\{ \left\langle a \right\rangle y \mid a \in A, y \in Y \}) / \!\equiv \) where \(F_\mathsf {BA}:\mathsf {Set}\rightarrow \mathsf {BA}\) denotes the free Boolean algebra functor and where, roughly speaking, \(\equiv \) is the congruence generated from the axioms \({\left\langle a \right\rangle \perp } \equiv {\perp }\) and \(\left\langle a \right\rangle (y_1 \vee y_2) \mathrel {\equiv } \left\langle a \right\rangle (y_1) \vee \left\langle a \right\rangle (y_2)\) for each \(a \in A\). This is explained in more detail in [21]. The initial algebra for this functor is the socalled LindenbaumTarski algebra [7] of modal formulas \(\left( \phi \mathrel {::=} \perp \mid \phi \vee \phi \mid \lnot \phi \mid \left\langle a \right\rangle \phi \right) \) quotiented by logical equivalence. The definition of an appropriate \(\delta \) can be found in, e.g., [21]—the semantics Open image in new window of a formula then amounts to the standard one [7].
Different types of probabilistic transition systems also fit into the dual adjunction framework, see, e.g, [17].
Subobjects and IntersectionPreserving Functors. We denote by \(\mathsf {Sub}(X)\) the collection of subobjects of an object \(X \in \mathcal {C}\). Let \(\le \) be the order on subobjects \(s :S \rightarrowtail X, s' :S' \rightarrowtail X\) given by \(s \le s'\) iff there is \(m :S \rightarrow S'\) s.t. \(s = s' \circ m\). The intersection \(\bigwedge J \rightarrowtail X\) of a family \(J = \{s_i :S_i \rightarrow X\}_{i \in I}\) is defined as the greatest lower bound w.r.t. the order \(\le \). In a complete category, it can be computed by (wide) pullback. We denote the maps in the limiting cone by \(x_i :\bigwedge J \rightarrowtail S_i\).
For a functor \(B :\mathcal {C} \rightarrow \mathcal {D}\), we say B preserves (wide) intersections if it preserves these wide pullbacks, i.e., if \((B(\bigwedge J), \{B x_i\}_{i \in I})\) is the pullback of \(\{B s_i :B S_i \rightarrow B X\}_{i \in I}\). By [2, Lemma 3.53] (building on [29]), finitary functors on \(\mathsf {Set}\) ‘almost’ preserve wide intersections: for every such functor B there is a functor \(B'\) which preserves wide intersections and agrees with B on all nonempty sets. Finally, if B preserves intersections, then it preserves monos.
Minimality Notions. The algorithm that we will describe in this paper learns a minimal and reachable representation of an object. The intuitive notions of minimality and reachability are formalised as follows.
Definition 4
We call a \(B\)coalgebra \((X,\gamma )\) minimal w.r.t. logical equivalence if the theory map \( th _{}^{\gamma }:X \rightarrow Q\varPhi \) is a monomorphism.
Definition 5
We call a pointed \(B\)coalgebra \((X,\gamma ,x_0)\) reachable if for any subobject \(s:S \rightarrow X\) and \(s_0:1 \rightarrow S\) with \(x_0 = s \circ s_0\): if S is a subcoalgebra of \((X,\gamma )\) then s is an isomorphism.
For expressive logics [27], behavioural equivalence coincides with logical equivalence. Hence, in that case, our algorithm learns a “wellpointed coalgebra” in the terminology of [2], i.e., a pointed coalgebra that is reachable and minimal w.r.t. behavioural equivalence. All logics appearing in this paper are expressive.
4 Subformula Closed Collections of Formulas
Definition 6
A subobject \(j :\varPsi \rightarrow \varPhi \) is called a subformula closed collection (of formulas) if there is a unique Lcoalgebra structure \(\sigma :\varPsi \rightarrow L \varPsi \) such that \((\varPsi , \sigma )\) is a recursive Lcoalgebra and j is the (necessarily unique) coalgebratoalgebra map from \((\varPsi ,\sigma )\) to the initial algebra \((\varPhi ,\alpha )\).
Remark 7
The uniqueness of \(\sigma \) in Definition 6 is implied if L preserves monomorphisms. This is the case in our examples. The notion of recursive coalgebra goes back to [23, 28]. The paper [1] contains a claim that the first item of our definition of subformula closed collection is implied by the second one if L preserves preimages. In our examples both properties of \((\varPsi ,\sigma )\) are verified directly, rather than by relying on general categorical results.
Example 8
In the setting of Example 1, where the initial Lalgebra is based on the set \(A^*\) of words over the set (of inputs) A, a subset \(\varPsi \subseteq A^*\) is subformulaclosed if it is suffixclosed, i.e., if for all \(a w \in \varPsi \) we have \(w \in \varPsi \) as well.
Example 9
In the setting that \(B= (\mathcal {P})^A\) for some set of actions A, \(\mathcal {C} = \mathsf {Set}\) and \(\mathcal {D}=\mathsf {BA}\), the logic is given as a functor L on Boolean algebras as discussed in Example 3. As a subformula closed collection is an object in \(\varPsi \), we are not simply dealing with a set of formulas, but with a Boolean algebra. The connection to the standard notion of being closed under taking subformulas in modal logic [7] can be sketched as follows: given a set \(\varDelta \) of modal formulas that is closed under taking subformulas, we define a Boolean algebra \(\varPsi _\varDelta \subseteq \varPhi \) as the smallest Boolean subalgebra of \(\varPhi \) that is generated by the set \(\hat{\varDelta } = \{ [\phi ]_{\varPhi } \mid \phi \in \varDelta \}\) where for a formula \(\phi \) we let \([\phi ]_\varPhi \in \varPhi \) denote its equivalence class in \(\varPhi \).
It is then not difficult to define a suitable \(\sigma :\varPsi _\varDelta \rightarrow L \varPsi _\varDelta \). As \(\varPsi _\varDelta \) is generated by closing \(\hat{\varDelta }\) under Boolean operations, any two states \(x_1,x_2\) in a given coalgebra \((X,\gamma )\) satisfy Open image in new window . In other words, equivalence w.r.t. \(\varPsi _\varDelta \) coincides with equivalence w.r.t. the set of formulas \(\varDelta \). This explains why in the concrete algorithm, we do not deal with Boolean algebras explicitly, but with subformula closed sets of formulas instead.
Lemma 10
Let \(\varPsi {\mathop {\rightarrowtail }\limits ^{j}} \varPhi \) be a subformula closed collection, with coalgebra structure \(\sigma :\varPsi \rightarrow L\varPsi \). Then \( th _{\varPsi }^{\gamma } = Qj \circ th _{\varPhi }^{\gamma }\) is the unique map making (6) commute. We call \( th _{\varPsi }^{\gamma }\) the \(\varPsi \)theory map, and omit the \(\varPsi \) if it is clear from the context.
5 Reachability and the Base
In this section, we define the notion of base of an endofunctor, taken from [8]. This allows us to speak about the (direct) successors of states in a coalgebra, and about reachability, which are essential ingredients of the learning algorithm.
Definition 11
Let \(B:\mathcal {C} \rightarrow \mathcal {C}\) be an endofunctor. We say \(B\) has a base if for every arrow \(f :X \rightarrow BY\) there exist \(g :X \rightarrow BZ\) and \(m :Z \rightarrowtail Y\) with m a monomorphism such that \(f = Bm \circ g\), and for any pair \(g' :X \rightarrow BZ', m' :Z' \rightarrowtail Y\) with \(Bm' \circ g' = f\) and \(m'\) a monomorphism there is a unique arrow \(h :Z \rightarrow Z'\) such that \(Bh \circ g = g'\) and \(m' \circ h = m\), see Diagram (7). We call (Z, g, m) the (\(B\))base of the morphism f.
For \(B= (\mathcal {P})^A :\mathsf {Set}\rightarrow \mathsf {Set}\), the \(B\)base of \(f :X \rightarrow Y\) is given by the inclusion \(m :Z \rightarrowtail Y\), where \(Z=\{ y \in Y \mid \exists x \in X, \exists a \in A \text{ s.t. } y \in f(x)(a)\}\).
Proposition 12
Suppose \(\mathcal {C}\) is complete and wellpowered, and \(B :\mathcal {C} \rightarrow \mathcal {C}\) preserves (wide) intersections. Then B has a base.
If \(\mathcal {C}\) is a locally presentable category, then it is complete and wellpowered [3, Remark 1.56]. Hence, in that case, any functor \(B :\mathcal {C} \rightarrow \mathcal {C}\) which preserves intersections has a base. The following lemma will be useful in proofs.
Lemma 13
Let \(B:\mathcal {C} \rightarrow \mathcal {C}\) be a functor that has a base and that preserves preimages. Let \(f :S \rightarrow BX\) and \(h :X \rightarrow Y\) be morphisms, let (Z, g, m) be the base of f and let \(e :Z \rightarrow W, m' :W \rightarrow Y\) be the (strong epi, mono)factorisation of \(h \circ m\). Then \((W,Be \circ g, m')\) is the base of \(Bh \circ f\).
The \(B\)base provides an elegant way to relate reachability within a coalgebra to a monotone operator on the (complete) lattice of subobjects of the carrier of the coalgebra. Moreover, we will see that the least subcoalgebra that contains a given subobject of the carrier can be obtained via a standard least fixpoint construction. Finally, we will introduce the notion of prefix closed subobject of a coalgebra, generalising the prefix closedness condition from Angluin’s algorithm.
Definition 14
Let \(B\) be a functor that has a base, \(s :S \rightarrowtail X\) a subobject of some \(X \in \mathcal {C}\) and let \((X,\gamma )\) be a \(B\)coalgebra. Let \((\varGamma (S), g, \varGamma _\gamma ^B(s))\) be the Bbase of \(\gamma \circ s\), see Diagram (8). Whenever \(B\) and \(\gamma \) are clear from the context, we write \(\varGamma (s)\) instead of \(\varGamma _\gamma ^B(s)\).
Lemma 15
Let \(B:\mathcal {C} \rightarrow \mathcal {C}\) be a functor with a base and let \((X,\gamma )\) be a \(B\)coalgebra. The operator \(\varGamma :\mathsf {Sub}(X) \rightarrow \mathsf {Sub}(X)\) defined by \(s \mapsto \varGamma (s)\) is monotone.
Intuitively, \(\varGamma \) computes for a given set of states S the set of “immediate successors”, i.e., the set of states that can be reached by applying \(\gamma \) to an element of S. We will see that prefixpoints of \(\varGamma \) correspond to subcoalgebras. Furthermore, \(\varGamma \) is the key to formulate our notion of closed table in the learning algorithm.
Proposition 16
Let \(s :S \rightarrowtail X\) be a subobject and \((X, \gamma ) \in \mathsf {Coalg}(B)\) for \(X \in \mathcal {C}\) and \(B:\mathcal {C} \rightarrow \mathcal {C}\) a functor that has a base. Then s is a subcoalgebra of \((X, \gamma )\) if and only if \(\varGamma (s) \le s\). Consequently, the collection of subcoalgebras of a given \(B\)coalgebra forms a complete lattice.
Using this connection, reachability of a pointed coalgebra (Definition 5) can be expressed in terms of the least fixpoint \(\mathsf {lfp}\) of an operator defined in terms of \(\varGamma \).
Theorem 17
Let \(B:\mathcal {C} \rightarrow \mathcal {C}\) be a functor that has a base. A pointed Bcoalgebra \((X,\gamma ,x_0)\) is reachable iff \(X \cong \mathsf {lfp}(\varGamma \vee x_0)\) (isomorphic as subobjects of X, i.e., equal).
This justifies defining the reachable part from an initial state \(x_0 :1 \rightarrowtail X\) as the least fixpoint of the monotone operator \(\varGamma \vee x_0\). Standard means of computing the least fixpoint by iterating this operator then give us a way to compute this subcoalgebra. Further, \(\varGamma \) provides a way to generalise the notion of “prefixed closedness” from Angluin’s L\(^*\) algorithm to our categorical setting.
Definition 18
Let \(s_0,s \in \mathsf {Sub}(X)\) for some \(X\in \mathcal {C}\) and let \((X,\gamma )\) be a \(B\)coalgebra. We call s \(s_0\)prefix closed w.r.t. \(\gamma \) if \(s = \bigvee _{i=0}^n s_i\) for some \(n \ge 0\) and a collection \(\{s_i \mid i = 1,\ldots ,n\}\) with \(s_{j + 1} \le \varGamma (\bigvee _{i=0}^j s_i)\) for all j with \(0 \le j < n\).
6 Learning Algorithm

\((S,\hat{\gamma }, \hat{s}_0)\) is correct w.r.t. the collection \(\varPhi \) of all tests, i.e., the theory of \((X,\gamma )\) and \((S,\hat{\gamma })\) coincide on the initial states \(s_0\) and \(\hat{s}_0\), (Definition 25);

\((S,\hat{\gamma },\hat{s}_0)\) is minimal w.r.t. logical equivalence;

\((S,\hat{\gamma },\hat{s}_0)\) is reachable.
The first point means that the learned coalgebra is ‘correct’, that is, it agrees with the coalgebra of the teacher on all possible tests from the initial state. For instance, in case of deterministic automata and their logic in Example 1, this just means that the language of the learned automaton is the correct one.

for any subobject \(S \rightarrowtail X\) and subformula closed subobject \(\varPsi \) of \(\varPhi \), the composite theory map Open image in new window

for \((S, \hat{\gamma }, \hat{s}_0)\) a pointed coalgebra, whether or not it is correct w.r.t. the collection \(\varPhi \) of all tests;

in case of a negative answer to the previous question, a counterexample, which essentially is a subobject \(\varPsi '\) of \(\varPhi \) representing some tests on which the learned coalgebra is wrong (defined more precisely below);

for a given subobject S of X, the ‘next states’; formally, the computation of the Bbase of the composite arrow Open image in new window
The first three points correspond respectively to the standard notions of membership query (‘filling in’ the table with rows S and columns \(\varPsi \)), equivalence query and counterexample generation. The last point, about the base, is more unusual: it does not occur in the standard algorithm, since there a canonical choice of \((X,\gamma )\) is used, which allows to represent next states in a fixed manner. It is required in our setting of an arbitrary coalgebra \((X,\gamma )\).
In the remainder of this section, we describe the abstract learning algorithm and its correctness. First, we describe the basic ingredients needed for the algorithm: tables, closedness, counterexamples and a procedure to close a given table (Sect. 6.1). Based on these notions, the actual algorithm is presented (Sect. 6.2), followed by proofs of correctness and termination (Sect. 6.3).
Assumption 19

that we deal with coalgebras over the base category \(\mathcal {C} = \mathsf {Set}\);

a functor \(B :\mathcal {C} \rightarrow \mathcal {C}\) that preserves preimages and wide intersections;

a category \(\mathcal {D}\) with an initial object 0 s.t. arrows with domain 0 are monic;

a functor \(L :\mathcal {D} \rightarrow \mathcal {D}\) with an initial algebra \(L \varPhi {\mathop {\rightarrow }\limits ^{\cong }} \varPhi \);

an adjunction \(P\dashv Q:\mathcal {C} \leftrightarrows \mathcal {D}^\mathsf {op}\), and a logic \(\delta :L P\Rightarrow PB\).
Moreover, we assume a pointed Bcoalgebra \((X, \gamma , s_0)\).
Remark 20
We restrict to \(\mathcal {C} = \mathsf {Set}\), but see it as a key contribution to state the algorithm in categorical terms: the assumptions cover a wide class of functors on \(\mathsf {Set}\), which is the main direction of generalisation. Further, the categorical approach will enable future generalisations. The assumptions on the category \(\mathcal {C}\) are: it is complete, wellpowered and satisfies that for all (strong) epis \(q:S \rightarrow \overline{S} \in \mathcal {C}\) and all monos \(i :S' \rightarrow S\) such that \(q \circ i\) is mono there is a morphism \(q^{1} :\overline{S} \rightarrow S\) such that (i) \(q \circ q^{1} = \mathsf {id}\) and \(q^{1} \circ q \circ i = i\).
6.1 Tables and Counterexamples
Definition 21
A table is a pair \((S {\mathop {\rightarrowtail }\limits ^{s}} X, \varPsi {\mathop {\rightarrowtail }\limits ^{i}} \varPhi )\) consisting of a subobject s of X and a subformulaclosed subobject i of \(\varPhi \).
Definition 22
A table \((S,\varPsi )\) is closed if there exists a map \(k :\varGamma (S) \rightarrow S\) such that Diagram (9) commutes. A table \((S,\varPsi )\) is sharp if the composite map Open image in new window is monic.
Definition 23
Let \((S,\varPsi )\) be a table. A coalgebra structure \(\hat{\gamma } :S \rightarrow B S\) is called a conjecture (for \((S,\varPsi )\)) if Diagram (10) commutes.
It is essential to be able to construct a conjecture from a closed table. The following, stronger result is a variation of Proposition 16.
Theorem 24
A sharp table is closed iff there exists a conjecture for it. Moreover, if the table is sharp and B preserves monos, then this conjecture is unique.
Definition 25
Let \((S,\varPsi )\) be a table, and let \((S,\hat{\gamma },\hat{s}_0)\) be a pointed Bcoalgebra on S. We say \((S,\hat{\gamma }, \hat{s}_0)\) is correct w.r.t. \(\varPsi \) if Diagram (11) commutes.
All conjectures constructed during the learning algorithm will be correct w.r.t. the subformula closed collection \(\varPsi \) of formulas under consideration.
Lemma 26
Suppose \((S,\varPsi )\) is closed, and \(\hat{\gamma }\) is a conjecture. Then \( th _{\varPsi }^{\gamma } \circ s = th _{\varPsi }^{\hat{\gamma }} :S \rightarrow Q\varPsi \). If \(\hat{s}_0 :1 \rightarrow S\) satisfies \(s \circ \hat{s}_0 = s_0\) then \((S,\hat{\gamma },\hat{s}_0)\) is correct w.r.t. \(\varPsi \).
We next define the crucial notion of counterexample to a pointed coalgebra: a subobject \(\varPsi '\) of \(\varPsi \) on which it is ‘incorrect’.
Definition 27
Let \((S,\varPsi )\) be a table, and let \((S,\hat{\gamma },\hat{s}_0)\) be a pointed Bcoalgebra on S. Let \(\varPsi '\) be a subformula closed subobject of \(\varPhi \), such that \(\varPsi \) is a subcoalgebra of \(\varPsi '\). We say \(\varPsi '\) is a counterexample (for \((S,\hat{\gamma },\hat{s}_0)\), extending \(\varPsi \)) if \((S,\hat{\gamma },\hat{s}_0)\) is not correct w.r.t. \(\varPsi '\).
The following elementary lemma states that if there are no more counterexamples for a coalgebra, then it is correct w.r.t. the object \(\varPhi \) of all formulas.
Lemma 28
Let \((S,\varPsi )\) be a table, and let \((S,\hat{\gamma },\hat{s}_0)\) be a pointed Bcoalgebra on S. Suppose that there are no counterexamples for \((S,\hat{\gamma },\hat{s}_0)\) extending \(\varPsi \). Then \((S,\hat{\gamma },\hat{s}_0)\) is correct w.r.t. \(\varPhi \).
The following describes, for a given table, how to extend it with the successors (in X) of all states in S. As we will see below, by repeatedly applying this construction, one eventually obtains a closed table.
Definition 29
We define \(\mathsf {close}(S,\varPsi ) ~{:=}~ \{ \overline{s} :\overline{S} \rightarrowtail X \mid th _{}^{\gamma } \circ \overline{s} = r, s \le \overline{s} \le s \vee \varGamma (s) \} \). For each \(\overline{s} \in \mathsf {close}(S,\varPsi )\) we have \(s \le \overline{s}\) and thus \(s = \overline{s} \circ \kappa \) for some \(\kappa :S \rightarrow \overline{S}\).
Lemma 30
In Definition 29, for each \(\overline{s} \in \mathsf {close}(S,\varPsi )\), we have \(\kappa = q \circ inl _{\vee }\).
We will refer to \(\kappa =q \circ inl _{\vee }\) as the connecting map from s to \(\overline{s}\).
Lemma 31
In Definition 29, if there exists \(q^{1} :\overline{S} \rightarrow S \vee \varGamma (S)\) such that \(q \circ q^{1} = \mathsf {id}\) and \(q^{1} \circ q \circ inl _{\vee }= inl _{\vee }\), then \(\mathsf {close}(S,\varPsi )\) is nonempty.
By our assumptions, the hypothesis of Lemma 31 is satisfied (Remark 20), hence \(\mathsf {close}(S,\varPsi )\) is nonempty. It is precisely (and only) at this point that we need the strong condition about existence of right inverses to epimorphisms.
6.2 The Algorithm
6.3 Correctness and Termination
Correctness is stated in Theorem 33. It relies on establishing loop invariants:
Theorem 32
The following is an invariant of both loops in Algorithm 1 in Sect. 6.2: 1. \((S,\varPsi )\) is sharp, 2. \(s \circ \hat{s}_0 = s_0\), and 3. s is \(s_0\)prefix closed w.r.t. \(\gamma \).
Theorem 33
If Algorithm 1 in Sect. 6.2 terminates, then it returns a pointed coalgebra \((S,\hat{\gamma },\hat{s}_0)\) which is minimal w.r.t. logical equivalence, reachable and correct w.r.t. \(\varPhi \).
In our termination arguments, we have to make an assumption about the coalgebra which is to be learned. It does not need to be finite itself, but it should be finite up to logical equivalence—in the case of deterministic automata, for instance, this means the teacher has a (possibly infinite) automaton representing a regular language. To speak about this precisely, let \(\varPsi \) be a subobject of \(\varPhi \). We take a (strong epi, mono)factorisation of the theory map, i.e., Open image in new window for some strong epi e and mono m. We call the object \(\vert X \vert _{\varPsi }\) in the middle the \(\varPsi \)logical quotient. For the termination result (Theorem 37), \(\vert X \vert _{\varPhi }\) is assumed to have finitely many quotients and subobjects, which just amounts to finiteness, in \(\mathsf {Set}\).
We start with termination of the inner while loop (Corollary 36). This relies on two results: first, that once the connecting map \(\kappa \) is an iso, the table is closed, and second, that—under a suitable assumption on the coalgebra \((X,\gamma )\)—during execution of the inner while loop, the map \(\kappa \) will eventually be an iso.
Theorem 34
Let \((S, \varPsi )\) be a sharp table, let \(\overline{S} \in \mathsf {close}(S,\varPsi )\) and let \(\kappa :S \rightarrow \overline{S}\) be the connecting map. If \(\kappa \) is an isomorphism, then \((S,\varPsi )\) is closed.
Lemma 35
Consider a sequence of sharp tables \((S_i {\mathop {\rightarrowtail }\limits ^{s_i}} X,\varPsi )_{i \in \mathbb {N}}\) such that \(s_{i+1} \in \mathsf {close}(S_i, \varPsi )\) for all i. Moreover, let \((\kappa _i :S_i \rightarrow S_{i+1})_{i \in \mathbb {N}}\) be the connecting maps (Definition 29). If the logical quotient \(\vert X \vert _{\varPhi }\) of X has finitely many subobjects, then \(\kappa _i\) is an isomorphism for some \(i \in \mathbb {N}\).
Corollary 36
If the \(\varPhi \)logical quotient \(\vert X \vert _{\varPhi }\) has finitely many subobjects, then the inner while loop of Algorithm 1 terminates.
For the outer loop, we assume that \(\vert X \vert _{\varPhi }\) has finitely many quotients, ensuring that every sequence of counterexamples proposed by the teacher is finite.
Theorem 37
If the \(\varPhi \)logical quotient \(\vert X \vert _{\varPhi }\) has finitely many quotients and finitely many subobjects, then Algorithm 1 terminates.
7 Future Work
We showed how duality plays a natural role in automata learning, through the central connection between states and tests. Based on this foundation, we proved correctness and termination of an abstract algorithm for coalgebra learning. The generality is not so much in the base category (which, for the algorithm, we take to be \(\mathsf {Set}\)) but rather in the functor used; we only require a few mild conditions on the functor, and make no assumptions about its shape. The approach is thus considered coalgebra learning rather than automata learning.
Returning to automata, an interesting direction is to extend the present work to cover learning of, e.g., nondeterministic or alternating automata [5, 9] for a regular language. This would require explicitly handling branching in the type of coalgebra. One promising direction would be to incorporate the forgetful logics of [19], which are defined within the same framework of coalgebraic logic as the current work. It is not difficult to define in this setting what it means for a table to be closed ‘up to the branching part’, stating, e.g., that even though the table is not closed, all the successors of rows are present as combinations of other rows.
Another approach would be to integrate monads into our framework, which are also used to handle branching within the theory of coalgebras [16]. It is an intriguing question whether the current approach, which allows to move beyond automatalike examples, can be combined with the CALF framework [13], which is very far in handling branching occurring in various kinds of automata.
Footnotes
Notes
Acknowledgments
We are grateful to Joshua Moerman, Nick Bezhanishvili, Gerco van Heerdt, Aleks Kissinger and Stefan Milius for valuable discussions and suggestions.
References
 1.Adámek, J., Lücke, D., Milius, S.: Recursive coalgebras of finitary functors. ITA 41(4), 447–462 (2007)MathSciNetzbMATHGoogle Scholar
 2.Adámek, J., Milius, S., Moss, L.S., Sousa, L.: Wellpointed coalgebras. Logical Methods Comput. Sci. 9(3) (2013)Google Scholar
 3.Adámek, J., Rosický, J.: Locally Presentable and Accessible Categories. Cambridge Tracts in Mathematics. Cambridge University Press, Cambridge (1994)CrossRefGoogle Scholar
 4.Angluin, D.: Learning regular sets from queries and counterexamples. Inf. Comput. 75(2), 87–106 (1987)MathSciNetCrossRefGoogle Scholar
 5.Angluin, D., Eisenstat, S., Fisman, D.: Learning regular languages via alternating automata. In: Yang, Q., Wooldridge, M. (eds.) IJCAI 2015, pp. 3308–3314. AAAI Press (2015)Google Scholar
 6.Barlocco, S., Kupke, C.: Angluin learning via logic. In: Artemov, S., Nerode, A. (eds.) LFCS 2018. LNCS, vol. 10703, pp. 72–90. Springer, Cham (2018). https://doi.org/10.1007/9783319720562_5CrossRefGoogle Scholar
 7.Blackburn, P., de Rijke, M., Venema, Y.: Modal Logic. Cambridge Tracts in Theoretical Computer Science, vol. 53. Cambridge University Press, Cambridge (2001)CrossRefGoogle Scholar
 8.Blok, A.: Interaction, observation and denotation. Master’s thesis, ILLC Amsterdam (2012)Google Scholar
 9.Bollig, B., Habermehl, P., Kern, C., Leucker, M.: Angluinstyle learning of NFA. In: Boutilier, C. (ed.) Proceedings of the 21st International Joint Conference on Artificial Intelligence, IJCAI 2009, pp. 1004–1009 (2009)Google Scholar
 10.Borceux, F.: Handbook of Categorical Algebra. Encyclopedia of Mathematics and its Applications, vol. 1. Cambridge University Press, Cambridge (1994)CrossRefGoogle Scholar
 11.van Heerdt, G.: An abstract automata learning framework. Master’s thesis, Radboud Universiteit Nijmegen (2016)Google Scholar
 12.van Heerdt, G., Sammartino, M., Silva, A.: CALF: categorical automata learning framework. In: Goranko, V., Dam, M. (eds.) 26th EACSL Annual Conference on Computer Science Logic, CSL 2017. LIPIcs, vol. 2, pp. 29:1–29:24. Schloss Dagstuhl  LeibnizZentrum fuer Informatik (2017)Google Scholar
 13.van Heerdt, G., Sammartino, M., Silva, A.: Learning automata with sideeffects. CoRR, abs/1704.08055 (2017)Google Scholar
 14.Jacobs, B.: Introduction to Coalgebra: Towards Mathematics of States and Observation. Cambridge Tracts in Theoretical Computer Science, vol. 59. Cambridge University Press, Cambridge (2016)zbMATHGoogle Scholar
 15.Jacobs, B., Silva, A.: Automata learning: a categorical perspective. In: van Breugel, F., Kashefi, E., Palamidessi, C., Rutten, J. (eds.) Panangaden Festschrift. LNCS, vol. 8464, pp. 384–406. Springer, Cham (2014). https://doi.org/10.1007/9783319068800_20CrossRefGoogle Scholar
 16.Jacobs, B., Silva, A., Sokolova, A.: Trace semantics via determinization. J. Comput. Syst. Sci. 81(5), 859–879 (2015)MathSciNetCrossRefGoogle Scholar
 17.Jacobs, B., Sokolova, A.: Exemplaric expressivity of modal logics. J. Logic Comput. 20(5), 1041–1068 (2009)MathSciNetCrossRefGoogle Scholar
 18.Klin, B.: Coalgebraic modal logic beyond sets. Electr. Notes Theor. Comput. Sci. 173, 177–201 (2007)CrossRefGoogle Scholar
 19.Klin, B., Rot, J.: Coalgebraic trace semantics via forgetful logics. Logical Methods Comput. Sci. 12(4) (2016)Google Scholar
 20.Kupke, C., Kurz, A., Pattinson, D.: Algebraic semantics for coalgebraic logics. Electr. Notes Theor. Comput. Sci. 106, 219–241 (2004)MathSciNetCrossRefGoogle Scholar
 21.Kupke, C., Pattinson, D.: Coalgebraic semantics of modal logics: an overview. Theor. Comput. Sci. 412(38), 5070–5094 (2011)MathSciNetCrossRefGoogle Scholar
 22.Maler, O., Pnueli, A.: On the learnability of infinitary regular sets. Inf. Comput. 118(2), 316–326 (1995)MathSciNetCrossRefGoogle Scholar
 23.Osius, G.: Categorical set theory: a characterization of the category of sets. J. Pure Appl. Algebra 4(1), 79–119 (1974)MathSciNetCrossRefGoogle Scholar
 24.Pavlovic, D., Mislove, M., Worrell, J.B.: Testing semantics: connecting processes and process logics. In: Johnson, M., Vene, V. (eds.) AMAST 2006. LNCS, vol. 4019, pp. 308–322. Springer, Heidelberg (2006). https://doi.org/10.1007/11784180_24CrossRefzbMATHGoogle Scholar
 25.Rutten, J.J.M.M.: Automata and coinduction (an exercise in coalgebra). In: Sangiorgi, D., de Simone, R. (eds.) CONCUR 1998. LNCS, vol. 1466, pp. 194–218. Springer, Heidelberg (1998). https://doi.org/10.1007/BFb0055624CrossRefGoogle Scholar
 26.Rutten, J.J.M.M.: Universal coalgebra: a theory of systems. Theor. Comput. Sci. 249(1), 3–80 (2000)MathSciNetCrossRefGoogle Scholar
 27.Schröder, L.: Expressivity of coalgebraic modal logic: the limits and beyond. Theor. Comput. Sci. 390(2–3), 230–247 (2008)MathSciNetCrossRefGoogle Scholar
 28.Taylor, P.: Practical Foundations of Mathematics. Cambridge University Press, Cambridge (1999)zbMATHGoogle Scholar
 29.Trnková, V.: On descriptive classification of setfunctors. I. Comment. Math. Univ. Carolinae 12(1), 143–174 (1971)MathSciNetzbMATHGoogle Scholar
 30.Vaandrager, F.W.: Model learning. Commun. ACM 60(2), 86–95 (2017)CrossRefGoogle Scholar
Copyright information
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.