IJCAR 2020: Automated Reasoning pp 518-533

# Mechanised Modal Model Theory

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 12166)

## Abstract

In this paper, we discuss the mechanisation of some fundamental propositional modal model theory. The focus on models is novel: previous work in mechanisations of modal logic have centered on proof systems and applications in model-checking. We have mechanised a number of fundamental results from the first two chapters of a standard textbook (Blackburn et al. ). Among others, one important result, the Van Benthem characterisation theorem, characterises the connection between modal logic and first order logic. This latter captures the desired saturation property of ultraproduct models on countably incomplete ultrafilters.

## 1 Introduction

The theory of modal logic has long been a fruitful area when it comes to mechanisation. The proof systems are appealing, and the applications in model-checking are of clear real-world interest. It helps also that the subject domain (proof calculi and automata) are well-suited to “standard” theorem-proving technology (rule inductions and interesting data types).

There has been much less work on the model theory behind modal logic; indeed even in first order logic, most developments concern themselves only with model theory inasmuch as it is required to show completeness of an accompanying proof system. As our experience demonstrates, it is also clear that modern theorem-proving systems are not necessarily so well-suited to the mathematics behind model theory. Harrison  complained in 1998 that the very notion of validity is awkward to capture in HOL, and our own work shows up further failings in simple type theory.

Nonetheless, there is much interesting mathematics to be found even in the early chapters of a standard text such as Blackburn et al. . The fact that mechanising only as far as [1, Chapter 2] requires what we believe to be the first mechanisation of the notion of ultraproduct (ultimately leading to Łoś’s theorems and other results), is a strong suggestion that we are exploring novel mathematical ground for interactive theorem-proving systems.

Contributions. This paper presents the first mechanised proofs of a number of basic results from the first two chapters of Blackburn et al.  (e.g., bounded morphisms, bisimulations and the finite model property via selection), as well as
• Two versions of Łoś’s theorem on the saturation of ultraproduct models;

• modal equivalence as bisimilarity between ultrafilter extensions; and

• a close approximation of Van Benthem’s Characterisation Theorem.

We also discuss where HOL’s simple type theory lets us down: some standard results (including the best possible statement of Van Benthem’s Characterisation Theorem) seem impossible to prove in our setting.

HOL4 Notation. All of our theorems have been pretty-printed to Open image in new window from the HOL theory files. We hope that most of the basic syntax is easy to follow. In a few places we use Open image in new window to denote the arbitrary choice of an element from set Open image in new window (appealing to the Axiom of Choice). The power-set of a set Open image in new window is written Open image in new window . In a number of places, we use HOL’s “itself” type to allow us to explicitly mention a type via a term. The type Open image in new window has just one value inhabiting it for any given choice of Open image in new window ; that value is written Open image in new window .

Source Availability. Our HOL4 sources are available from GitHub at https://github.com/u5943321/Modal-Logic

The sources build under HOL4 commit with SHA 03829d8986f.

## 2 Syntax, Semantics and the Standard Translation

In our mechanisation, we consider the basic modal language, in which the only primitive modal operator is the ‘$$\Diamond$$’. A modal formula is either of form Open image in new window , where p is of type Open image in new window , enumerating all the possible variable symbols, a disjunction Open image in new window (pretty-printed to Open image in new window in most places), the falsity Open image in new window , a negation Open image in new window , or, finally, of the form Open image in new window . We define a data type called Open image in new window to represent the formulas of this modal language.

If we wanted to consider modal operators with any arity, we should change the last constructor of modal formulas so it takes two parameters: a natural number indexing the modal operator, and a list of modal formulas. This would in turn require a well-formedness predicate to be defined over formulas to make sure that modalities were applied to the right number of arguments.

A model where these formulas can be interpreted consists of a frame and a valuation, where a Open image in new window is a Open image in new window -set with a relation on it, and a model adds valuations for the variables present at each world:

### Definition 3

By requiring Open image in new window in various clauses above, we ensure that models’ world sets must be non-empty if they are to satisfy any formulas.

### Definition 4 (Notions of equivalence)

We cannot omit the type parameter Open image in new window in the definition, as there would otherwise be a type, namely the type of the underlying set of the models we are talking about, that only appears on the right-hand side but not on the left-hand side of the definition. HOL forbids such definitions for soundness reasons. Also, HOL does not permit quantification over types, so it is impossible to write Open image in new window , with $$\mu$$ a type. Therefore, this definition is not exactly encoding the equivalence in the usual sense: when we mention equivalence of formulas in usual mathematical language, we are implicitly referring to the class of all models, but the constraint here bans us from talking about all models of all possible types at once.

### Definition 5

As one would expect, we translate Open image in new window into an existential formula. To ensure we use a fresh variable, we use $$x+1$$ as our new variable symbol in this clause. The standard translation gives a first-order reformulation of satisfaction of modal formulas:

## 3 Basic Results

We discuss some highlights of mechanised results from Blackburn et al. [1, §2.1–§2.3] below.

### 3.1 Tree-Like Property

A tree-like model is a model whose underlying frame is a tree. If Open image in new window , a frame, is also a tree with root r, we write Open image in new window :

The tree-like property says each satisfiable modal formula can be satisfied in a tree-like model:

### Proposition 2

The world set of the tree-like model constructed from Open image in new window is a set of lists of worlds in Open image in new window (such lists are effectively paths from the root to various positions within the tree). Thus, passing to a tree-like model does not preserve the model type. The tree-like lemma is used to prove the finite model property via selection afterwards.

### 3.2 Bisimulation

Though apparently verbose, the definition of bisimulation in HOL is straightforward.

### Definition 7

It is trivial to prove by induction that bisimilar worlds are modal equivalent. As the most significant theorem on the basic theory of bisimulations, we proved the Hennessy-Milner theorem, which states that modal equivalence and bisimulation on image-finite models are the same thing. An image-finite model is a model where every world can only be related to finitely many worlds. In HOL, we get:

### Theorem 1

Bisimulation is an interesting topic in modal logic. Three other significant theorems on bisimulations (including an approximation of Van Benthem Characterisation theorem) are discussed later.

### 3.3 Finite Model Property

The proof of Lemma 1 further relies on the following fact: Given a set Open image in new window of modal-formulas that is finite up to equivalence, if we combine the elements of Open image in new window using only connectives other than Open image in new window , then we get only finitely many non-equivalent formulas. To show this, we prove that there is an injection from the set of equivalence classes of such combinations to a finite set. For the antecedent of Lemma 1, we require the assumption that the universe of $$\beta$$ is infinite since we rely on the fact that two modal formulas Open image in new window and Open image in new window are equivalent if and only if Open image in new window and Open image in new window are equivalent. This would be easy to prove in set theory. However, in simple type theory, the proof of Open image in new window iff Open image in new window requires us (in the right-to-left direction) to be able to construct a model with a new world inserted, something only sure to be possible if the Open image in new window universe is infinite. As the construction used Proposition 2, we change the type of the model by passing to a finite model via selection:

We also mechanised the filtration approach, but omit the details for lack of space. The advantage of filtration is that the resulting finite model is over worlds of the same type as in the starting model.

All the results proved above can be captured using Open image in new window models everywhere. If one takes Open image in new window to be Open image in new window (or any infinite type) in Theorem 2, one can also exploit the fact that numbers and lists of numbers have the same cardinality to derive a finite model result that preserves the “input type”.

## 4 Mechanising Ultrafilters and Ultraproducts

A number of results in Blackburn et al. [1, §2.5–§2.7] rely on theorems about ultrafilters and ultraproducts.

### 4.1 Ultrafilters

Given a non-empty set J, a set Open image in new window is a filter if it contains J itself, is closed under binary intersection, and is closed upward.

### Definition 8

We call L a proper filter if L is not the whole power set. An ultrafilter is a filter U such that for every $$X\subseteq J$$, exactly one of X or $$J\setminus X$$ is in U. Intuitively, subsets Open image in new window in an ultrafilter Open image in new window are considered as ‘large’ subsets of Open image in new window .

The ultrafilter theorem states that every proper filter is contained in an ultrafilter:

### Theorem 3

(The proof uses Zorn’s Lemma.)

A subset A of the power set on J has finite intersection property if once we take the intersection of a finite, nonempty family in A, the resultant set is nonempty.

As a corollary of ultrafilter theorem, a set with finite intersection property is contained in an ultrafilter.

### 4.2 Ultraproducts

The notion of ultraproducts is defined for sets, modal models, and first-order models.

### Definition 11

We write Open image in new window to denote the equivalence class that f belongs to. In the case where Open image in new window for all $$j\in J$$, the ultraproduct is called the ultrapower of Open image in new window modulo U.

Ultraproduct for Modal Models. Given a family Open image in new window of modal models indexed by J and an ultrafilter U on J, the ultraproduct model of Open image in new window modulo U (notation: Open image in new window ) is described as follows:

As Open image in new window is an equivalence relation, if one element in an equivalence class satisfies the required condition, then all the elements in the equivalence class will satisfy the condition. Therefore, if we replace all the existential quantifiers with universal quantifiers in the above definition, the construction is still valid, and will give the same model as the current definition.

The critical result we need about ultraproducts of modal models is a modal version of the fundamental theorem of ultraproducts, also known as Łoś’s theorem.

### Theorem 4 (Łoś’s theorem, Modal version)

According to our intuition about ultrafilters, we can gloss this theorem to mean that the ultraproduct of a family of modal models satisfies a modal formula if and only if ‘most of’ the models in the family satisfy the formula. Though it is possible to derive this result from Łoś’s theorem of first-order models using the standard translation, our proof is direct, by structural induction on Open image in new window .

Ultraproducts for First-Order Models. Given a family Open image in new window of first-order models indexed by J and an ultrafilter U on J, the ultraproduct model of Open image in new window modulo U (notation: Open image in new window ) is given by:

## 5 Ultrafilter Extensions

The first application of the theory of ultrafilters above is to construct the ultrafilter extension of a model, which has the nice property of being modally saturated (m-saturated hereafter). To define m-saturation, we give the following three definitions (the first two are called finitely satisfiable, satisfiable) consecutively:

### Definition 14

For m-saturated models, bisimulation and modal equivalence coincide:

In HOL:

### Definition 15

Using the ultrafilter theorem and some basic properties about ultrafilters, we derive:

### Proposition 4

In particular, every world Open image in new window is embedded as the principal filter Open image in new window on Open image in new window generated by Open image in new window in the ultrafilter extension or Open image in new window . Also, the above leads to the proof of the fact that the ultrafilter extension of every model is m-saturated. The m-saturatedness of ultrafilter extensions together with Proposition 3 immediately gives the central result about ultrafilter extension: bisimilarity of worlds in a model Open image in new window can be characterised as bisimilarity in Open image in new window .

## 6 Countable Saturatedness of Ultrapower Models

Given a first-order model Open image in new window with no information about interpretation of its function symbols, we can expand the model Open image in new window by adding an interpretation of some function symbols. For our purpose, we are only interested in adding the interpretation of finitely many nullary function symbols, also called constants. We write Open image in new window to denote the model that is the result of adding each element in A to Open image in new window as a new constant. Further, the function f is a bijection between $$\{0,\cdots ,n-1\}$$ and A, which is assumed to be finite, so that each nullary function symbol c will be interpreted as Open image in new window in Open image in new window .

### Definition 16

As is apparent from the definition, the only difference between a model and its expansion is the interpretation of function symbols.

### Definition 17

We say Open image in new window is countably saturated if Open image in new window is n-saturated for every natural number n. The ultimate goal is to prove a lemma to be used in the proof of Van Benthem characterisation theorem: For a family of non-empty models, their ultraproduct on a countably incomplete ultrafilter is countably saturated.

Here a countably incomplete ultrafilter is an ultrafilter that contains a countably infinite family that intersects to the empty set. We prove in HOL that such ultrafilters do exist using Theorem 3. The above theorem is not simply a direct consequence of Łoś’s theorem: that result is about ultraproducts of first-order models, and it says nothing about expansion. But to prove Lemma 2, we must prove a statement for an expanded first-order model, and this first order model is itself obtained by converting a ultraproduct of modal models.

To deal with this issue, the key observation is that constants are nothing more than forcing some symbols to be sent to some points in a model under every valuation, hence rather than use nullary function symbols, we fix a set of variable letters, each corresponding to a function symbol, and only consider the valuations that send these variable letters to certain fixed points. With this idea, we can remove all the constants in a formula, and hence change our scope from an expanded model back to the unexpanded model. To get rid of the constants $$\{0,\cdots , n-1\}$$, we replace every Open image in new window with Open image in new window , and replace every constant Open image in new window by Open image in new window . This operation is done by the function Open image in new window which takes a natural number (the number of constants we want to remove), and a first-order formula (where the only function symbols may appear are the constants $$0,\cdots , n-1$$). Since $$0,\cdots ,n-1$$ in a shifted formula are now designed to be sent to fixed places $$f \ 0,\cdots ,f \ (n-1)$$, it does not make sense to assign these variable symbols anywhere else. Therefore, to talk about evaluation of shifted formula, the first thing is to make sure that the valuations we are considering send the variables which actually denote constants to the right place. Hence we shift the valuations accordingly, and then prove that a formula is satisfied on an expanded model is satisfied under a valuation if and only if the shifted formula is satisfied under the shifted valuation. Also, we prove that ‘taking the ultraproduct first-order model commutes with the convertion from modal to a first-order model on certain formulas’, in the sense that the resulting models satisfies the same first-order formulas without function symbols. By putting these two results together, we prove Lemma 2 using the proof in Chang and Keisler .

## 7 Van Benthem’s Characterisation Theorem

Note that the standard translation of any modal formula can only contain unary predicate symbols which correspond to propositional letters, one binary predicate symbol which corresponds to the relation, and no function symbols. A first-order formula which only uses these symbols is called an $${\mathcal L}_{\tau }^1$$-formula. An $${\mathcal L}_{\tau }^1$$-formula which contains only one free variable is called invariant under bisimulation if for all models Open image in new window and Open image in new window with Open image in new window and Open image in new window , if there exists a bisimulation relation between Open image in new window and Open image in new window relating w and v, then $$\phi$$ holds at Open image in new window if and only if it holds at Open image in new window when both Open image in new window and Open image in new window are viewed as first-order models.

### Definition 18

Because of the same problem we met when defining equivalence of formulas, the type parameters are necessary here. However, although it is possible to prove theorems for different types $$\alpha$$ and $$\beta$$ in the above definition, in the theorems to come, we will only consider the case where $$\alpha$$ and $$\beta$$ are the same.

The Van Benthem characterisation theorem says an $$\mathcal L_{\tau }^1$$ formula with at most one free variable x is invariant under bisimulation precisely when it is equivalent to the standard translation of some modal formula at x. It is immediate from Proposition 1 that every such formula which is equivalent to a standard translation is invariant for bisimulation. We cannot prove it as an ‘if and only if’ statement, since according to the proofs in , we can only prove the two directions separately as:

### Proposition 5

which cannot be put together into a double implication. To see the reason: given an $$\mathcal L_{\tau }^1$$-formula $$\phi$$ with no more then one free variable, by the second theorem above, if Open image in new window is invariant under bisimulation for models with Open image in new window -worlds, then $$\phi$$ is equivalent to a standard translation on a model with $$\alpha$$-worlds. However, if we want to prove the converse of this statement, we need to start with the assumption that $$\phi$$ is equivalent to a standard translation on models with $$\alpha$$-worlds, and prove that $$\phi$$ is invariant for bisimulation for models with Open image in new window -worlds. But by the first theorem above, we can only conclude $$\phi$$ is invariant for bisimulation for models of type $$\alpha$$. The point is that it is not the fact that all our desired operations can be taken within a type. In particular, we cannot take ultraproducts of models and preserve cardinalities. The cardinality of the type universe of Open image in new window is too large to be embedded into $$\alpha$$, so we cannot just fix the ‘base type’ to be Open image in new window and get an ‘if and only if’ statement-we cannot derive $$\phi$$ is invariant for bisimulation for models with Open image in new window -worlds from the fact that $$\phi$$ is invariant for bisimulation for models with $$\alpha$$-worlds. If we could quantify over types (as we could in a theorem prover based on dependent type theory), then we can could define ‘invariant under bisimulation for models of every type’, and hence prove the original statement of Van Benthem characterisation theorem.

For the proof of the two theorems above, the first one is immediate from Proposition 1, and the second one requires another critical lemma saying ‘modal equivalence between two worlds implies bisimilarity of the two worlds when embedded in some other models’. More precisely, if two worlds Open image in new window and Open image in new window are modal equivalent, then we can find an ultrafilter U on J such that in ultrapower models of Open image in new window and Open image in new window on U respectively, there is a bisimulation between the worlds corresponding to Open image in new window and Open image in new window .

### Theorem 7

The proof of the above relies on Lemma 2.

## 8 Conclusion

To summarise, we have mechanised all of the results (appearing as propositions, lemmas and theorems) in the first two chapters in Blackburn et al.  that can be captured by the HOL logic, and which are about the basic modal language. The exceptions are:
• The result in Sect. 2.6 about ‘definability’, which requires a definition of the ‘models closed under taking ultraproducts’. Simple type theory cannot capture such large sets.

• The result about ‘safety’ in Sect. 2.7 is a result about the PDL language, which has infinitely many modal operators. For the moment, we have restricted our attention to the basic modal language, with only $$\Diamond$$ (and the derived $$\Box$$).

The two characterisation theorems from Blackburn et al. , namely Theorem 2.68 (Van Benthem’s Characterisation Theorem) and Theorem 2.78, are the only two mechanised theorem such that translating the ‘if and only if’ statements from set theory into simple theory does not yield an ‘if and only if’ statement. Blackburn et al.’s proof of Theorem 2.78 has the same pattern as Van Benthem’s Characterisation Theorem (discussed earlier), and is less complicated.

For each of the mechanised definitions and results, we write the statement in HOL to be as close as possible to the original statement in . We believe that this makes it as easy as possible for people who are interested in mechanising other results in  to continue with our work as a starting point. The work on ultraproducts up to Łoś’s theorem is independent of our work on modal model theory, and should be generally useful in other model-theoretic applications.

### 8.1 Related Work

We believe that we are the first to mechanise the bulk of the results in this paper. Of course, much work has been done in this and similar areas. For example, de Wind’s thesis  is a notable early mechanisation of modal logic, mainly focusing on proving the validity of modal formulas via natural deduction. Of similar vintage is Harrison’s mechanisation of foundational results about first order model theory , in particular compactness. We used this mechanisation directly in our own work. A great deal of work has also been done in the mechanisation of first order proof theory, such as the recent pearl by Blanchette et al. , showing completeness in elegant fashion.

The connections between modal logic and process algebra are well-understood and there has been a great deal of mechanised work on the operational theory of such (co-)algebraic systems, starting at least as far back as Nesi . Our proof of the Hennessy-Milner theorem (Theorem 1) is a gesture in this direction, but Van Benthem’s theorem is much deeper and uses bisimulations as a tool to understanding the connection between modal and first order logics, rather than as a connection to process algebras.

Mechanised work with ultrafilters began with Fleuriot’s use of them to mechanise non-standard analysis . We are unaware of any previous mechanised use of ultraproducts or ultrapowers.

## References

1. 1.
Blackburn, P., de Rijke, M., Venema, Y.: Modal Logic. Cambridge University Press, Cambridge (2001)
2. 2.
Blanchette, J.C., Popescu, A., Traytel, D.: Unified classical logic completeness. In: Demri, S., Kapur, D., Weidenbach, C. (eds.) IJCAR 2014. LNCS (LNAI), vol. 8562, pp. 46–60. Springer, Cham (2014).
3. 3.
Chang, C.C., Keisler, H.J.: Model Theory. North Holland, Amsterdam (1990)
4. 4.
Fleuriot, J.: A Combination of Geometry Theorem Proving and Nonstandard Analysis with Application to Newton’s Principia. Springer, London (2001).
5. 5.
Harrison, J.: Formalizing basic first order model theory. In: Grundy, J., Newey, M. (eds.) TPHOLs 1998. LNCS, vol. 1479, pp. 153–170. Springer, Heidelberg (1998).
6. 6.
Nesi, M.: Mechanising a modal logic for value-passing agents in HOL. Electron. Notes Theor. Comput. Sci. 5, 31–46 (1996).
7. 7.
de Wind, P.: Modal Logic in Coq. Master’s thesis, Vrije Universiteit (2001)Google Scholar