
1 Introduction

The behavioural equivalence bisimilarity, due to Milner [41] and Park [44], is one of the cornerstones of concurrency theory. It captures which states of a labelled transition system, a simple yet widely used model of concurrent systems, behave the same. Hennessy and Milner [29] provided a logical characterization of bisimilarity by introducing a logic, known as Hennessy-Milner logic, and proving that states are bisimilar if and only if they satisfy the same formulas of the logic. If the labelled transition system has finitely many states then for two states that are not bisimilar there exists a formula, often referred to as a distinguishing formula, such that one state satisfies the formula whereas the other state does not. This formula explains why the two states are not bisimilar. Cleaveland [12] presented a polynomial time algorithm that computes a distinguishing formula for states that are not bisimilar. Consider the following labelled transition system.

figure c

The states s and t are not bisimilar. This can be explained by a formula that expresses that a state can transition to a state that can subsequently transition to a purple (square) state as well as a green (hexagon) state. State s satisfies this formula but state t does not.

To model randomness in systems, labelled Markov chains are often used. Larsen and Skou [39] introduced probabilistic bisimilarity to capture which states of a labelled Markov chain behave the same. They also introduced a logic that characterizes probabilistic bisimilarity. Desharnais, Edalat, and Panangaden [19] simplified that logic and presented a polynomial time algorithm that produces a formula that distinguishes two states which are not probabilistic bisimilar. Consider the following labelled Markov chain.

figure d

The states s and t are not probabilistic bisimilar. State t can transition with probability more than \(\frac{1}{2}\) to a green state that can transition to a purple state, whereas state s cannot. This property can be expressed in the logic, giving rise to a formula that distinguishes the states s and t.

Giacalone, Jou, and Smolka [27] observed that probabilistic bisimilarity is not robust. Miniscule changes to the probabilities may alter which states are probabilistic bisimilar. Instead of an equivalence relation, they suggested exploiting a pseudometric to capture the behavioural similarity of states. That is, each pair of states is assigned a distance, a real number in the interval [0, 1], which measures how similar the states behave. The smaller the distance, the more alike the states behave. Distance zero captures that the states are behaviourally equivalent.

Desharnais, Gupta, Jagadeesan, and Panangaden [20] presented such a pseudometric. They showed that distance zero captures probabilistic bisimilarity. Therefore, those distances are known as probabilistic bisimilarity distances. These distances can be computed in polynomial time, as has been shown by Chen et al. [11]. Tang [48] developed and implemented algorithms that can compute the probabilistic bisimilarity distances for labelled Markov chains with thousands of states within seconds. The states s and t in the above labelled Markov chain have distance 0.125. How does one explain that 0.125 captures the similarity of their behaviour? That is the main question that we address in this paper.

To define their probabilistic bisimilarity distances, Desharnais et al. introduce a logic. The labelled Markov chains that they consider differ slightly from the ones we study in this paper: they label transitions whereas we label states (by colours/shapes), and where we require that the probabilities of the outgoing transitions of a state add up to one, they allow them to sum to less than one as well. State-labelled Markov chains have become the norm in probabilistic model checking. Probabilistic model checkers such as PRISM [38] and Storm [14] consider state-labelled Markov chains. Since each transition-labelled Markov chain can be encoded as a state-labelled one [46], this difference does not substantially impact any of the results. If the probabilities do not sum to one, one can add an additional state and transition to that state with the remaining probability. Also this difference does not significantly change the results. Adjusted to our setting, slightly simplified, and using a different syntax, the logic can be captured by the following grammar:

figure e

where a is a label of a state and q is a rational in the interval [0, 1]. This logic characterizes the probabilistic bisimilarity distances (see, for example, [6, 20]). Roughly speaking, the distance of two states is determined by a formula of the logic that distinguishes them the most. Such a formula explains their probabilistic bisimilarity distance. Consider, for example, the states s and t in the above labelled Markov chain. As we already mentioned, their distance is 0.125. This distance can be explained by the formula . This formula captures the probability of reaching a green state in one transition and subsequently reaching a purple state after the second transition. For state s that probability is 0.5 and it is 0.625 for state t. Note that the \(\mathord {\bigcirc }\) operator is similar to the next operator of linear temporal logic. Roughly, the interpretation of the formula \(\mathord {\bigcirc }\varphi \) in state s is the probability that \(\varphi \) holds in the successors of s.

As is common, we provide the above logic with a real-valued interpretation. For a formula of the logic, its interpretation maps each state of the labelled Markov chain to a real value in the interval [0, 1]. For example, for the formula  , its interpretation in state s is denoted by and has the value 0.5. The value of is 0.625. Their difference, which is 0.125, is the distance of the states s and t. The distinguishing formula for the states s and t is fairly simple. As we will discuss next, we need all the operators of the logic to explain the probabilistic bisimilarity distances and a single formula may not suffice.

1.1 Main Results

As we will show, the above logic is a minimal logic that characterizes the probabilistic bisimilarity distances. That is, if we remove any operator from the logic then the resulting logic does not characterize the probabilistic bisimilarity distances anymore. Furthermore, we will demonstrate that there exist finite labelled Markov chains for which the distances of some states cannot be explained by a single formula. However, as we will prove, we can explain the probabilistic bisimilarity distances by means of a sequence of formulas. Given two states, say u and v, we will construct a sequence of formulas \(\varphi _{uv}^0\), \(\varphi _{uv}^1\), \(\varphi _{uv}^2\), ... such that the sequence \(\llbracket {\varphi _{uv}^0}\rrbracket \!(u) - \!\llbracket {\varphi _{uv}^0}\rrbracket \!(v)\), \(\llbracket {\varphi _{uv}^1}\rrbracket \!(u) - \!\llbracket {\varphi _{uv}^1}\rrbracket \!(v)\), \(\llbracket {\varphi _{uv}^2}\rrbracket \!(u) - \!\llbracket {\varphi _{uv}^2}\rrbracket \!(v)\), ... converges to the probabilistic bisimilarity distance of u and v. We will also present an algorithm that computes those formulas and we will show that each formula can be computed in polynomial time.

1.2 Related Work

In addition to the references to the literature mentioned above, next we will discuss some other related work. Many of the behavioural equivalences have been characterized logically. For example, Feng and Zhang [25] provide a logical characterization of probabilistic bisimilarity for probabilistic automata. Bernardo and Miculan [4] present an algorithm that builds a distinguishing formula for states of a probabilistic automaton that are not probabilistic bisimilar. König, Mika-Michalski, and Schröder [37] propose a general method to construct a distinguishing formula for a variety of systems, including probabilistic automata.

Behavioural pseudometrics have been introduced for a large variety of systems that model randomness. For example, Ferns, Panangaden, and Precup [26] study probabilistic bisimilarity distances for Markov decision processes, Deng, Chothia, Palamidessi, and Pang [15] introduce them for probabilistic automata, and de Alfaro, Majumdar, Raman, and Stoelinga [1] present them for games.

Also many behavioural pseudometrics have been characterized logically. For example, Desharnais, Laviolette, and Tracol [23] present a logical characterization of \(\varepsilon \)-bisimilarity, a notion closely related to distances, for probabilistic automata. Du, Deng, and Gebler [24] logically characterize probabilistic bisimilarity distances for probabilistic automata. Pantelic and Lawford [43] provide a logical characterization of a behavioural pseudometric for probabilistic discrete event structures. Komorida et al. [35], König and Mika-Michalski [36], Wild and Schröder [51], as well as Wißmann, Milius, and Schröder [52], present general frameworks to obtain logical characterizations of behavioural pseudometrics.

Whereas many logics for systems with randomness have a real-valued interpretation, Castiglione, Gebler, and Tini [9, 10] introduce a logic for probabilistic automata with a boolean-valued interpretation. Their logic contains an operator with which we can express properties such as “a state can transition with probability a half to a purple state and with probability a half to a green state.” It is this operator that allows them to define a mimicking formula of a state. As the name suggests, this formula mimics the behaviour of the state. Furthermore, they endow the formulas with a pseudometric and show that the probabilistic bisimilarity distance of two states is the distance of their mimicking formulas. Hence, the distance of two states can be explained by means of the mimicking formulas of those states.

2 Labelled Markov Chains and Probabilistic Bisimilarity Distances

In this section, we introduce several key notions that play a central role in the remainder of the paper. We define the model of interest, namely a labelled Markov chain. Furthermore, we introduce probabilistic bisimilarity, an equivalence relation that captures which states of a labelled Markov chain behave the same, and probabilistic bisimilarity distances, which measure the similarity of behaviour of those states.

First, we recall some notions from probability theory. Given a finite set X, a function \(\mu : X \rightarrow [0, 1]\) is a probability distribution on X if \(\sum _{x \in X} \mu (x) = 1\). We denote the set of probability distributions on X by \(\mathcal {D}_{\mathbb {R}}(X)\). For \(\mu \in \mathcal {D}_{\mathbb {R}}(X)\) and \(A \subseteq X\), we often write \(\mu (A)\) for \(\sum _{x \in A} \mu (x)\). Similarly, for \(\omega \in \mathcal {D}_{\mathbb {R}}(X \times X)\), \(a \in X\), and \(A \subseteq X\), we usually write \(\omega (a, A)\) for \(\sum _{x \in A} \omega (a, x)\). For \(\mu \in \mathcal {D}_{\mathbb {R}}(X)\), we define the support of \(\mu \) by \(\textrm{support}(\mu ) = \{\, x \in X \mid \mu (x) > 0 \,\}\). A probability distribution \(\mu \in \mathcal {D}_{\mathbb {R}}(X)\) is rational if \(\mu (x) \in \mathbb {Q}\) for all \(x \in X\). We denote the set of rational probability distributions on X by \(\mathcal {D}_{\mathbb {Q}}(X)\). Obviously, \(\mathcal {D}_{\mathbb {Q}}(X) \subseteq \mathcal {D}_{\mathbb {R}}(X)\).
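These notions translate directly into code with exact rational arithmetic. The sketch below is ours, not the paper's notation: the distribution `mu` and the helpers `is_distribution`, `support`, and `mass` are illustrative names.

```python
from fractions import Fraction

def is_distribution(mu):
    # mu : X -> [0, 1] is a probability distribution if its values sum to one
    return all(0 <= p <= 1 for p in mu.values()) and sum(mu.values()) == 1

def support(mu):
    # support(mu) = { x in X | mu(x) > 0 }
    return {x for x, p in mu.items() if p > 0}

def mass(mu, A):
    # mu(A) = sum of mu(x) over x in A
    return sum(mu[x] for x in A)

mu = {'x': Fraction(1, 2), 'y': Fraction(1, 2), 'z': Fraction(0)}
```

Using `Fraction` keeps all probabilities exact, which matters later when distances are computed rather than approximated.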

Definition 1

A labelled Markov chain is a tuple \(\langle S, L, \tau , \ell \rangle \) consisting of

  • a finite set S of states,

  • a finite set L of labels,

  • a transition probability function \(\tau : S \rightarrow \mathcal {D}_{\mathbb {Q}}(S)\), and

  • a labelling function \(\ell : S \rightarrow L\).
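Definition 1 can be represented directly as data. The sketch below uses Python dicts with exact rationals; the concrete four-state chain is a hypothetical example of ours, not one of the paper's figures.

```python
from fractions import Fraction

# <S, L, tau, ell>: states, labels, transition probability function, and
# labelling function; the concrete chain below is a hypothetical example
S = ['s', 't', 'a', 'b']
L = ['r', 'g', 'p']
ell = {'s': 'r', 't': 'r', 'a': 'g', 'b': 'p'}
tau = {'s': {'a': Fraction(1, 2), 'b': Fraction(1, 2)},
       't': {'a': Fraction(3, 4), 'b': Fraction(1, 4)},
       'a': {'a': Fraction(1)},   # absorbing
       'b': {'b': Fraction(1)}}

def is_lmc(S, L, tau, ell):
    # each tau(s) must be a rational probability distribution on S,
    # and each state must carry a label from L
    rows_ok = all(set(tau[s]) <= set(S)
                  and all(0 <= p <= 1 for p in tau[s].values())
                  and sum(tau[s].values()) == 1
                  for s in S)
    return rows_ok and all(ell[s] in L for s in S)
```

The check `sum(tau[s].values()) == 1` enforces the requirement, stated in the introduction, that the outgoing probabilities of every state add up to one.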

We restrict the transition probabilities to rationals as we will compute with them in Sections 6 and 7. For the remainder, we fix a labelled Markov chain \(\langle S, L, \tau , \ell \rangle \). We define probabilistic bisimilarity by means of the set \(\varOmega _{\mathbb {R}}(\mu , \nu )\), which is known as the transportation polytope [33] of the probability distributions \(\mu \) and \(\nu \).

Definition 2

For all \(\mu \), \(\nu \in \mathcal {D}_{\mathbb {R}}(S)\), the set \(\varOmega _{\mathbb {R}}(\mu , \nu )\) is defined by

$$ \varOmega _{\mathbb {R}}(\mu , \nu ) = \{\, \omega \in \mathcal {D}_{\mathbb {R}}(S \times S) \mid \forall s \in S : \omega (s, S) = \mu (s) \wedge \omega (S, s) = \nu (s) \,\}. $$

Definition 3

A relation \(R \subseteq S \times S\) is a probabilistic bisimulation if for all \((s, t) \in R\), \(\ell (s) = \ell (t)\) and there exists \(\omega \in \varOmega _{\mathbb {R}}(\tau (s), \tau (t))\) with \(\textrm{support}(\omega ) \subseteq R\). States s and t are probabilistic bisimilar, denoted \(s \sim t\), if \((s, t) \in R\) for some probabilistic bisimulation R.
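Definition 3 quantifies over couplings, but for labelled Markov chains probabilistic bisimilarity can equivalently be computed by partition refinement in the style of Larsen and Skou: compute the coarsest partition in which related states carry the same label and assign the same probability to every block. The sketch below is our illustration of that standard algorithm; the chain is hypothetical, with `s2` a copy of `s`.

```python
from fractions import Fraction

def prob_bisim_partition(states, ell, tau):
    # coarsest partition in which related states have the same label and
    # assign equal probability mass to every block
    by_label = {}
    for s in states:
        by_label.setdefault(ell[s], []).append(s)
    partition = list(by_label.values())
    while True:
        index = {s: i for i, block in enumerate(partition) for s in block}
        def signature(s):
            # probability mass tau(s) assigns to each current block
            sig = [Fraction(0)] * len(partition)
            for u, p in tau[s].items():
                sig[index[u]] += p
            return tuple(sig)
        refined = {}
        for i, block in enumerate(partition):
            for s in block:
                refined.setdefault((i, signature(s)), []).append(s)
        refined = list(refined.values())
        if len(refined) == len(partition):   # no block was split
            return partition
        partition = refined

# hypothetical chain: s2 behaves exactly like s, t does not
ell = {'s': 'r', 's2': 'r', 't': 'r', 'a': 'g', 'b': 'p'}
tau = {'s': {'a': Fraction(1, 2), 'b': Fraction(1, 2)},
       's2': {'a': Fraction(1, 2), 'b': Fraction(1, 2)},
       't': {'a': Fraction(3, 4), 'b': Fraction(1, 4)},
       'a': {'a': Fraction(1)}, 'b': {'b': Fraction(1)}}
partition = prob_bisim_partition(list(ell), ell, tau)
block_of = {s: i for i, block in enumerate(partition) for s in block}
```

On this chain, s and s2 end up in the same block (they are probabilistic bisimilar), while s and t are separated.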

To define the probabilistic bisimilarity distances, it is convenient to partition the set of state pairs into the following three sets.

Definition 4

The sets \(S^2_0\), \(S^2_1\) and \(S^2_?\) are defined by

$$\begin{aligned} S^2_0&= \{\, (s, t) \in S \times S \mid s \sim t \,\}\\ S^2_1&= \{\, (s, t) \in S \times S \mid \ell (s) \not = \ell (t) \,\}\\ S^2_?&= (S \times S) \setminus (S^2_0 \cup S^2_1) \end{aligned}$$

The set \(S^2_0\) contains those state pairs that have distance zero (cf. Theorem 6). The set \(S^2_1\) contains those state pairs that have a different label and, therefore, have distance one (cf. Definition 5). The set \(S^2_?\) contains the remaining state pairs. Note that some of these state pairs may have distance one, but cannot have distance zero. The probabilistic bisimilarity distances are defined in terms of the following function.

Definition 5

The function \(\varDelta : (S \times S \rightarrow [0, 1]) \rightarrow (S \times S \rightarrow [0, 1])\) is defined by

$$ \varDelta (d)(s, t) = \left\{ \begin{array}{ll} 0 &{} \textit{if }(s, t) \in S^2_0\\ 1 &{} \textit{if }(s, t) \in S^2_1\\ \displaystyle \inf _{\omega \in \varOmega _{\mathbb {R}}(\tau (s), \tau (t))} \sum _{u, v \in S} \omega (u, v) \; d(u, v) &{} \textit{if }(s, t) \in S^2_? \end{array} \right. $$

Let \(d \in S \times S \rightarrow [0, 1]\) and \(\omega \in \mathcal {D}_{\mathbb {R}}(S \times S)\). Instead of \(\sum _{u, v \in S} \omega (u, v) \; d(u, v)\) we write \(\omega \cdot d\) in the remainder to avoid clutter. Similarly, for \(f \in S \rightarrow [0, 1]\) and \(\mu \in \mathcal {D}_{\mathbb {R}}(S)\) we write \(f \cdot \mu \) instead of \(\sum _{s \in S} f(s) \, \mu (s)\).

For d, \(e \in S \times S \rightarrow [0, 1]\), we define \(d \sqsubseteq e\) if for all s, \(t \in S\), \(d(s, t) \le e(s, t)\). According to, for example, [22, Lemma 3.2], \(\langle S \times S \rightarrow [0, 1], \mathord {\sqsubseteq } \rangle \) is a complete lattice. Since the function \(\varDelta \) is a monotone function from a complete lattice to itself, we can conclude from the Knaster-Tarski fixed point theorem (see, for example, [13, Theorem 2.35]) that \(\varDelta \) has a least fixed point. We denote this least fixed point by \(\delta \). This least fixed point maps each pair of states to a real number in the interval [0, 1]: the probabilistic bisimilarity distance of the states. Distance zero captures probabilistic bisimilarity.

Theorem 6

([21, Theorem 4.10]). For all s, \(t \in S\), \(\delta (s, t) = 0\) if and only if \(s \sim t\).

The probabilistic bisimilarity distance function \(\delta \) is the limit of the distance functions \(\delta _n\) which only consider the first n transitions when comparing the similarity of the behaviour of states. This result can be seen as an instance of the Kleene fixed point theorem [34].

Definition 7

For each \(n \ge 0\), the function \(\delta _n : S \times S \rightarrow [0, 1]\) is defined by

$$ \delta _n(s, t) = \left\{ \begin{array}{ll} 0 &{} \textit{if }n=0\\ \varDelta (\delta _{n-1})(s, t) &{} \textit{otherwise.} \end{array} \right. $$

Proposition 8

\(\displaystyle \lim _{n \rightarrow \infty } \delta _n = \delta \).
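Proposition 8 suggests approximating \(\delta \) by iterating \(\varDelta \). The sketch below is our illustration, not the paper's algorithm: it uses exact rationals, solves each transportation problem by successive shortest augmenting paths, and omits the \(S^2_0\) clause of Definition 5 (this leaves the least fixed point, and hence the limit, unchanged, although finite iterates may then differ from \(\delta _n\) on probabilistic bisimilar pairs). The chain is hypothetical.

```python
from fractions import Fraction

def transport_cost(mu, nu, cost):
    # Minimum of sum omega(u, v) * cost[u][v] over couplings omega of mu and
    # nu, computed exactly by successive shortest augmenting paths.
    U = [u for u in mu if mu[u] > 0]
    V = [v for v in nu if nu[v] > 0]
    supply, demand = {u: mu[u] for u in U}, {v: nu[v] for v in V}
    flow = {(u, v): Fraction(0) for u in U for v in V}
    while any(supply.values()):
        # Bellman-Ford on the residual network from all nodes with supply left
        dist = {('U', u): (Fraction(0) if supply[u] > 0 else None) for u in U}
        dist.update({('V', v): None for v in V})
        pred = {}
        for _ in range(len(U) + len(V)):
            changed = False
            for u in U:
                if dist[('U', u)] is None:
                    continue
                for v in V:  # forward edges u -> v, uncapacitated
                    c = dist[('U', u)] + cost[u][v]
                    if dist[('V', v)] is None or c < dist[('V', v)]:
                        dist[('V', v)], pred[('V', v)], changed = c, u, True
            for v in V:
                if dist[('V', v)] is None:
                    continue
                for u in U:  # backward edges v -> u, capacity flow[u, v]
                    if flow[(u, v)] > 0:
                        c = dist[('V', v)] - cost[u][v]
                        if dist[('U', u)] is None or c < dist[('U', u)]:
                            dist[('U', u)], pred[('U', u)], changed = c, v, True
            if not changed:
                break
        sink = min((v for v in V if demand[v] > 0),
                   key=lambda v: dist[('V', v)])
        # trace the augmenting path back and find its bottleneck
        edges, node = [], ('V', sink)
        while node in pred:
            if node[0] == 'V':
                edges.append(('fwd', pred[node], node[1]))
                node = ('U', pred[node])
            else:
                edges.append(('bwd', node[1], pred[node]))
                node = ('V', pred[node])
        theta = min([supply[node[1]], demand[sink]] +
                    [flow[(u, v)] for kind, u, v in edges if kind == 'bwd'])
        for kind, u, v in edges:
            flow[(u, v)] += theta if kind == 'fwd' else -theta
        supply[node[1]] -= theta
        demand[sink] -= theta
    return sum(flow[(u, v)] * cost[u][v] for u in U for v in V)

def iterate_delta(states, ell, tau, n):
    # n applications of (a simplified) Delta, starting from the zero function
    d = {(x, y): Fraction(0) for x in states for y in states}
    for _ in range(n):
        d = {(x, y): Fraction(1) if ell[x] != ell[y] else
                     transport_cost(tau[x], tau[y],
                                    {u: {v: d[(u, v)] for v in tau[y]}
                                     for u in tau[x]})
             for x in states for y in states}
    return d

# hypothetical chain: a and b are absorbing with distinct labels
ell = {'s': 'r', 't': 'r', 'a': 'g', 'b': 'p'}
tau = {'s': {'a': Fraction(1, 2), 'b': Fraction(1, 2)},
       't': {'a': Fraction(3, 4), 'b': Fraction(1, 4)},
       'a': {'a': Fraction(1)}, 'b': {'b': Fraction(1)}}
d = iterate_delta(list(ell), ell, tau, 5)
```

On this chain the iterates stabilize at \(d(s, t) = \frac{1}{4}\): the cheapest coupling of \(\tau (s)\) and \(\tau (t)\) must move mass \(\frac{1}{4}\) between the two differently labelled absorbing states.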

3 A Logical Characterization

Below, we present a logical characterization of the probabilistic bisimilarity distances. We start with a logic very similar to the one introduced by Desharnais et al. [20].

Definition 9

The logic \(\mathcal {L}_{\lnot }\) is defined by

figure k

where \(a \in L\) and \(q \in \mathbb {Q} \cap [0, 1]\).

The above logic is slightly different from the one presented in [20] as we consider Markov chains with labelled states, whereas Desharnais et al. studied Markov chains with labelled transitions. In particular, a and \(\mathord {\bigcirc }\varphi \) were combined as \(\langle a \rangle \varphi \). Since we restrict our attention to finite state systems, we can restrict ourselves to finite disjunctions. In our setting, the constants true and false can be expressed as \(\bigvee _{a \in L} a\) (recall that we assume that the set L is finite as well) and \(\lnot \textrm{true}\), respectively. The logic of Desharnais et al. also contains the operator \(\lceil \varphi \rceil ^q\), which is redundant, as observed in [21, page 336]. The logic considered by Desharnais [18] lacks negation, but does include \(\lceil \varphi \rceil ^q\) and conjunction. The real-valued interpretation of the logic of Desharnais et al., which considers labelled transitions, is adjusted to our setting of labelled states as follows.

Definition 10

The function \(\llbracket {\cdot }\rrbracket \! : \mathcal {L}_{\lnot } \rightarrow S \rightarrow [0, 1]\) is defined by

$$\begin{aligned} \begin{array}{rcl} \!\llbracket {a}\rrbracket \!(s) &{} = &{} \left\{ \begin{array}{ll} 1 &{} \textit{if }\ell (s) = a\\ 0 &{} \textit{otherwise} \end{array} \right. \\ \!\llbracket {\mathord {\bigcirc }\varphi }\rrbracket \!(s) &{} = &{} \!\llbracket {\varphi }\rrbracket \! \cdot \tau (s)\\ \!\llbracket {\lnot \varphi }\rrbracket \!(s) &{} = &{} 1 - \!\llbracket {\varphi }\rrbracket \!(s)\\ \!\llbracket {\varphi \ominus q}\rrbracket \!(s) &{} = &{} \max (\!\llbracket {\varphi }\rrbracket \!(s) - q, 0)\\ \!\llbracket {\varphi \vee \psi }\rrbracket \!(s) &{} = &{} \max (\!\llbracket {\varphi }\rrbracket \!(s), \!\llbracket {\psi }\rrbracket \!(s)) \end{array} \end{aligned}$$

Note that \(\llbracket {\textrm{false}}\rrbracket \!\) and \(\llbracket {\textrm{true}}\rrbracket \!\) are the constant zero and constant one functions, respectively. The probabilistic bisimilarity distances can be characterized in terms of the logic.
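The clauses of Definition 10 translate directly into a recursive evaluator. In the sketch below, formulas are tagged tuples of our own devising (`('minus', phi, q)` stands for \(\varphi \ominus q\)) and the four-state chain is a hypothetical example.

```python
from fractions import Fraction

def ev(phi, s, ell, tau):
    # real-valued interpretation [[phi]](s) following Definition 10
    tag = phi[0]
    if tag == 'atom':    # [[a]](s) = 1 if ell(s) = a, else 0
        return Fraction(1 if ell[s] == phi[1] else 0)
    if tag == 'next':    # [[O phi]](s) = [[phi]] . tau(s)
        return sum(ev(phi[1], u, ell, tau) * p for u, p in tau[s].items())
    if tag == 'not':     # [[not phi]](s) = 1 - [[phi]](s)
        return 1 - ev(phi[1], s, ell, tau)
    if tag == 'minus':   # [[phi - q]](s) = max([[phi]](s) - q, 0)
        return max(ev(phi[1], s, ell, tau) - phi[2], Fraction(0))
    if tag == 'or':      # [[phi or psi]](s) = max of the two interpretations
        return max(ev(phi[1], s, ell, tau), ev(phi[2], s, ell, tau))
    raise ValueError(f'unknown operator {tag}')

ell = {'s': 'r', 't': 'r', 'a': 'g', 'b': 'p'}
tau = {'s': {'a': Fraction(1, 2), 'b': Fraction(1, 2)},
       't': {'a': Fraction(3, 4), 'b': Fraction(1, 4)},
       'a': {'a': Fraction(1)}, 'b': {'b': Fraction(1)}}
phi = ('next', ('atom', 'g'))   # "the next state is green"
vs, vt = ev(phi, 's', ell, tau), ev(phi, 't', ell, tau)
```

Here \(\llbracket {\varphi }\rrbracket \!(s) = \frac{1}{2}\) and \(\llbracket {\varphi }\rrbracket \!(t) = \frac{3}{4}\), so on this hypothetical chain the formula distinguishes the two states by \(\frac{1}{4}\).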

Theorem 11

([5, Theorem 40 and 44]). For all s, \(t \in S\),

$$ \delta (s, t) = \sup _{\varphi \in \mathcal {L}_{\lnot }} \!\llbracket {\varphi }\rrbracket \!(s) - \!\llbracket {\varphi }\rrbracket \!(t). $$

In the remainder of this paper, we consider the following logic. This logic also characterizes probabilistic bisimilarity distances. As we will show later, this logic can explain the probabilistic bisimilarity distances more concisely than the logic presented above.

Definition 12

The logic \(\mathcal {L}\) is defined by

figure l

where \(a \in L\) and \(q \in \mathbb {Q} \cap [0, 1]\).

Note that negation has been removed and conjunction has been added. Also the operator \(\oplus q\), which is dual to \(\ominus q\), has been added. This logic is very similar to the one considered by Desharnais [18].

Definition 13

The function \(\llbracket {\cdot }\rrbracket \! : \mathcal {L} \rightarrow S \rightarrow [0, 1]\) of Definition 10 is modified by

$$ \begin{array}{rcl} \!\llbracket {\varphi \oplus q}\rrbracket \!(s) &{} = &{} \min \{\!\llbracket {\varphi }\rrbracket \!(s) + q, 1\}\\ \!\llbracket {\varphi \wedge \psi }\rrbracket \!(s) &{} = &{} \min \{\!\llbracket {\varphi }\rrbracket \!(s), \!\llbracket {\psi }\rrbracket \!(s)\} \end{array} $$

As already mentioned above, this logic also characterizes the probabilistic bisimilarity distances.

Theorem 14

For all s, \(t \in S\), \(\delta (s, t) = \sup _{\varphi \in \mathcal {L}} \!\llbracket {\varphi }\rrbracket \!(s) - \!\llbracket {\varphi }\rrbracket \!(t)\).

Proof sketch

Each formula of \(\mathcal {L}\) can be rewritten to an equivalent formula of \(\mathcal {L}_{\lnot }\). For example, if \(\varphi \) is rewritten to \(\psi \) then \(\varphi \oplus q\) is rewritten to \(\lnot (\lnot \psi \ominus q)\). Each formula of \(\mathcal {L}\) has a dual: if \(\llbracket {\varphi }\rrbracket \! = 1 - \!\llbracket {\psi }\rrbracket \!\) then \(\varphi \) is a dual of \(\psi \). For example, if \(\varphi \) is a dual of \(\psi \) then \(\varphi \ominus q\) is a dual of \(\psi \oplus q\). Each formula of \(\mathcal {L}_{\lnot }\) can be rewritten to an equivalent formula of \(\mathcal {L}\). For example, if \(\varphi \) is rewritten to \(\psi \) then \(\lnot \varphi \) is rewritten to a dual of \(\psi \). The result now follows from Theorem 11.    \(\square \)

4 All Operators are Necessary

The logic \(\mathcal {L}\) is a minimal logic that characterizes the probabilistic bisimilarity distances. That is, if we remove any operator from the logic then the resulting logic does not characterize the probabilistic bisimilarity distances anymore. Due to lack of space, we only consider the logic \(\mathcal {L}_{\setminus \ominus }\), which does not have the \(\ominus q\) operator.

Definition 15

The logic \(\mathcal {L}_{\setminus \ominus }\) is defined by

figure m

where \(a \in L\) and \(q \in \mathbb {Q} \cap [0, 1]\).

Theorem 16

There exists a labelled Markov chain \(\langle S, L, \tau , \ell \rangle \) and s, \(t \in S\) such that

figure n

Proof sketch

Consider the following labelled Markov chain.

figure o

It can be shown that \(\delta (s, t) = \frac{7}{8}\). Furthermore, we can prove that for all \(\varphi \in \mathcal {L}_{\setminus \ominus }\) and \(q \in \mathbb {Q} \cap [0, 1]\), if then by structural induction on \(\varphi \). Using this result and Theorem 14, we can also show that for all \(\varphi \in \mathcal {L}_{\setminus \ominus }\), \(\llbracket {\varphi }\rrbracket \!(s) - \!\llbracket {\varphi }\rrbracket \!(t) \le \frac{27}{32}\) by structural induction on \(\varphi \).    \(\square \)

5 Explainability

In general, the probabilistic bisimilarity distance of two states cannot be explained by a single formula, as we will show next. That is, a distinguishing formula does not exist for every pair of states of every labelled Markov chain. But, as we will prove below, for every pair of states there exists a sequence of formulas that explains their distance.

Theorem 17

There exists a labelled Markov chain \(\langle S, L, \tau , \ell \rangle \) and s, \(t \in S\) such that for all \(\varphi \in \mathcal {L}\), .

Proof sketch

Consider the following labelled Markov chain.

figure s

It can be shown that \(\delta (s, t) = 1\). We can also prove that for all \(\varphi \in \mathcal {L}\), by structural induction on \(\varphi \).    \(\square \)

As we will show next, for every pair of states (s, t) there exists a sequence of formulas \((\xi _n)_n\) such that \(\delta (s, t) = \lim _{n \rightarrow \infty } \!\llbracket {\xi _n}\rrbracket \!(s) - \!\llbracket {\xi _n}\rrbracket \!(t)\). This sequence \((\xi _n)_n\) explains the distance \(\delta (s, t)\).

Proposition 18

For all s, \(t \in S\) there exists \((\xi _n)_n\) such that

$$ \delta (s, t) = \lim _{n \rightarrow \infty } \!\llbracket {\xi _n}\rrbracket \!(s) - \!\llbracket {\xi _n}\rrbracket \!(t). $$

Proof sketch

This can be concluded from Theorem 14 and the following. Let X be a nonempty subset of \(\mathbb {R}\) that is bounded above. Then there exists a sequence \((x_n)_n\) in X that converges to \(\sup X\) [8, page 4].    \(\square \)

The proof of the above proposition is not constructive. Below, we will construct a sequence of formulas \((\varphi _{st}^n)_n\) that explains the distance of the states s and t. In particular, \(\varphi _{st}^n\) is constructed so that

figure u

and, hence, \(\llbracket {\varphi _{st}^n}\rrbracket \!(s) - \!\llbracket {\varphi _{st}^n}\rrbracket \!(t) = \delta _n(s, t)\). That is, the formula \(\varphi _{st}^n\) explains the distance \(\delta _n(s, t)\).

If \(n = 0\) then \(\delta _n(s, t) = 0\). We choose the formula false since

figure v

Let \(n \ge 1\). For \((s, t) \in S^2_0\), also \(\delta _n(s, t) = 0\). Again we choose the formula false to explain the distance. For \((s, t) \in S^2_1\), we have that \(\delta _n(s, t) = 1\). In this case the formula \(\ell (s)\) explains \(\delta _n(s, t)\) since \(\ell (s) \not = \ell (t)\) and, therefore,

figure x

To construct a formula that explains distance \(\delta _n(s, t)\) for \((s, t) \in S^2_?\), we rely on the following result about distances and nonexpansive functions. A function \(f \in S \rightarrow [0, 1]\) is nonexpansive if for all s, \(t \in S\), \(| f(s) - f(t) | \le \delta _n(s, t)\). The set of nonexpansive functions is denoted by . This set forms a convex polytope and is known as the Lipschitz polytope. We denote its vertices by .

Proposition 19

For all \((s, t) \in S^2_?\) and \(n \ge 0\), there exists such that \(\delta _{n+1}(s, t) = f_{st}^{n} \cdot (\tau (s) - \tau (t))\).

Proof sketch

Let \((s, t) \in S^2_?\) and \(n \ge 0\). Then

$$ \delta _{n+1}(s, t) = \inf _{\omega \in \varOmega _{\mathbb {R}}(\tau (s), \tau (t))} \omega \cdot \delta _{n}. $$

We can view \(\delta _{n+1}(s, t)\) as the minimal cost of a transportation problem, where \(\tau (s)(u)\) represents the amount transported from the origin u, \(\tau (t)(v)\) captures the amount received at the destination v, \(\delta _n(u, v)\) represents the transportation cost from u to v, and each \(\omega \) captures a transportation plan, that is, \(\omega (u, v)\) is the amount transported from u to v (see, for example, [40, page 15]).

From the Kantorovich-Rubinstein duality theorem [31] we can conclude that

figure ab

In this dual to the above transportation problem, each f represents a price function (see, for example, [40, page 81]). Since a linear function on a convex polytope attains its maximum at a vertex (see, for example, [49, Theorem 2 of Chapter 1]), we can conclude that

figure ac

Since we can prove that , there exists such that \(\delta _{n+1}(s, t) = f^{n}_{st} \cdot (\tau (s) - \tau (t))\).    \(\square \)

The function \(f^n_{st}\) plays a key role in the formula explaining \(\delta _n(s, t)\). However, \(f^n_{st}\) is not necessarily unique. Consider the following labelled Markov chain.

figure af

For this example, the sequence \((\delta _n)_n\) converges in three steps, that is, \(\delta = \delta _3\). We have that \(\delta _2(u,v) = \frac{1}{2}\) and \(\delta _3(s,t) = \frac{1}{2}\). So we need the function \(f^2_{st}\) to satisfy \(\delta _3(s,t) = f^2_{st}(u) - f^2_{st}(v)\) and \(|f^2_{st}(u) - f^2_{st}(v)| \le \frac{1}{2}\). For each \(0 \le q \le \frac{1}{2}\), \(f^2_{st}(u) = \frac{1}{2} + q\) and \(f^2_{st}(v) = q\) satisfies these properties. As we will see, any \(f^n_{st}\) that satisfies these properties can be used to construct \(\varphi ^n_{st}\). How to compute these functions \(f_{st}^n\) is the topic of the next section.

As we will show in Theorem 22, we can construct a formula \(\psi ^n_{st}\) that captures the function \(f^n_{st}\), that is, \(\llbracket {\psi ^n_{st}}\rrbracket \! = f^n_{st}\). More about this soon. By means of \(\psi ^{n-1}_{st}\) we can explain the distance \(\delta _n(s, t)\) by the formula \((\mathord {\bigcirc }\psi ^{n-1}_{st}) \ominus (f^{n-1}_{st} \cdot \tau (t))\) since we have that

$$\begin{aligned} \!\llbracket {(\mathord {\bigcirc }\psi ^{n-1}_{st}) \ominus (f^{n-1}_{st} \cdot \tau (t))}\rrbracket \!(s)&= \max \{ (\!\llbracket {\psi ^{n-1}_{st}}\rrbracket \! \cdot \tau (s)) - (f^{n-1}_{st} \cdot \tau (t)), 0 \}\\&= \max \{ (f^{n-1}_{st} \cdot \tau (s)) - (f^{n-1}_{st} \cdot \tau (t)), 0 \}\\&= \max \{ f^{n-1}_{st} \cdot (\tau (s) - \tau (t)), 0 \}\\&= \max \{ \delta _n(s, t), 0 \}\\&= \delta _n(s, t) \end{aligned}$$

and, similarly, we can deduce that

figure ag

Let us return to the formula \(\psi ^n_{st}\) that captures the function \(f^n_{st}\). To construct \(\psi ^n_{st}\) we use the following result.

Lemma 20

([2, Lemma A7.2]). Let \(f \in S \rightarrow [0, 1]\). If for all u, \(v \in S\), there exists \(g_{uv} \in S \rightarrow [0, 1]\) such that \(g_{uv}(u) = f(u)\) and \(g_{uv}(v) = f(v)\), then

$$ f = \min _{u \in S} \max _{v \in S} g_{uv} = \max _{u \in S} \min _{v \in S} g_{uv}. $$
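Lemma 20 is easy to check numerically: as long as each \(g_{uv}\) agrees with f at u and at v, its values elsewhere are irrelevant, because choosing \(u = w\) (respectively \(v = w\)) pins the min-max (respectively max-min) at w to f(w). A sketch with arbitrary, randomly chosen off-constraint values; all names are hypothetical.

```python
import random
from fractions import Fraction

random.seed(1)
S = ['u1', 'u2', 'u3']
f = {'u1': Fraction(1, 4), 'u2': Fraction(3, 4), 'u3': Fraction(1, 2)}

# g[u][v] : S -> [0, 1] is arbitrary except that it agrees with f at u and v
g = {u: {v: {w: Fraction(random.randint(0, 4), 4) for w in S} for v in S}
     for u in S}
for u in S:
    for v in S:
        g[u][v][u] = f[u]
        g[u][v][v] = f[v]

# the two sides of Lemma 20, computed pointwise
minmax = {w: min(max(g[u][v][w] for v in S) for u in S) for w in S}
maxmin = {w: max(min(g[u][v][w] for v in S) for u in S) for w in S}
```

Both `minmax` and `maxmin` coincide with f, as the lemma predicts, for any choice of the unconstrained values.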

To apply the above lemma, we need to construct for all u, \(v \in S\) a formula \(\psi ^n_{stuv}\) such that

figure ah

The details are provided in Definition 21 and Theorem 22. From Lemma 20 we can then conclude that

figure ai

The above can be summarized as follows.

Definition 21

For all s, \(t \in S\),

$$ \varphi _{st}^0 = \textrm{false} $$

and

$$ \varphi _{st}^1 = \left\{ \begin{array}{ll} \textrm{false} &{} \textit{if }(s, t) \in S^2_0 \cup S^2_?\\ \ell (s) &{} \textit{if }(s, t) \in S^2_1 \end{array} \right. $$

For all s, \(t \in S\) and \(n \ge 2\),

figure aj

For all \((s, t) \in S^2_?\) and \(n \ge 1\),

$$ \psi ^n_{st} = \bigwedge _{u \in S} \bigvee _{v \in S} \psi ^n_{stuv} $$

For all \((s, t) \in S^2_?\), u, \(v \in S\), and \(n \ge 1\),

figure ak

Note that, for \((s, t) \in S^2_?\) and \(n \ge 2\), the formula \(\varphi _{st}^n\) contains \(|S|^2\) subformulas of the form \(\varphi _{uv}^{n-1}\). As a consequence, the size of \(\varphi _{st}^n\) grows exponentially in n. As we will see in Section 7, we can compute \(\varphi _{st}^n\) in polynomial time by sharing subformulas.

The above definition shows some similarities with the sequence of formulas introduced in [43, Definition 8]. Their setting is different: the transitions are labelled (as in [20]), the transition function is deterministic, and the labelling of the transitions is probabilistic. Their logic is simpler than the one introduced in [20] since the systems they consider are simpler. The sequence of formulas that they introduce is syntactically simpler than the one we define above. Their formulas are only used to prove a logical characterization, although those formulas can also be used for explainability.

Consider the states s and t of the following labelled Markov chain.

figure al

By definition, \(\varphi ^0_{st} = \textrm{false}\) and \(\varphi ^1_{st} = \textrm{false}\). For \(\varphi ^2_{st}\) we get

figure am

This formula can be simplified to \(\textrm{false}\). In the logic of Desharnais et al., which lacks \(\wedge \) and \(\oplus \), one would need 111 additional \(\lnot \) operators, making it less concise.

The formula \(\varphi ^3_{st}\) fills more than a page, but can be simplified to the formula . Although generally there does not exist a distinguishing formula for each pair of states (Theorem 17), in this case the formula \(\varphi ^3_{st}\) explains the distance of states s and t, since \(\delta (s, t) = 0.125\), \(\llbracket {\varphi ^3_{st}}\rrbracket \!(s) = 0.125\) and \(\llbracket {\varphi ^3_{st}}\rrbracket \!(t) = 0\). The formula captures the probability of reaching a green state in one transition and subsequently reaching another green state.

The formula \(\varphi ^3_{ts}\) can be simplified to . Since we have that \(\llbracket {\varphi ^3_{ts}}\rrbracket \!(t) = 0.125\) and \(\llbracket {\varphi ^3_{ts}}\rrbracket \!(s) = 0\), the formula \(\varphi _{ts}^3\) explains the distance \(\delta (t, s) = 0.125\). The formula represents the probability of reaching a green state in one transition and subsequently reaching a purple state.

The outermost test can be removed from the explanation. Hence, the formulas and explain the distance of states s and t as well.

Theorem 22

  1. (a)

    For all s, \(t \in S\) and \(n \ge 0\), \(\llbracket {\varphi ^n_{st}}\rrbracket \!(s) = \delta _n(s, t)\) and \(\llbracket {\varphi ^n_{st}}\rrbracket \!(t) = 0\).

  2. (b)

    For all \((s, t) \in S^2_?\) and \(n \ge 1\), \(\llbracket {\psi _{st}^n}\rrbracket \! = f_{st}^{n}\).

  3. (c)

    For all \((s, t) \in S^2_?\), u, \(v \in S\), and \(n \ge 1\), \(\llbracket {\psi ^n_{stuv}}\rrbracket \!(u) = f_{st}^{n}(u)\) and \(\llbracket {\psi ^n_{stuv}}\rrbracket \!(v) = f_{st}^{n}(v)\).

Proof sketch

This theorem can be proved by induction on n. Most steps of the proof have already been discussed above. To prove (c), let \((s, t) \in S^2_?\), u, \(v \in S\) and \(n \ge 1\). We need to distinguish three cases. Here we only consider the case that . Then

figure as

   \(\square \)

Combining Proposition 8 and Theorem 22, we obtain the following explainability result.

Corollary 23

For all s, \(t \in S\), \(\lim _{n \rightarrow \infty } \!\llbracket {\varphi _{st}^n}\rrbracket \!(s) - \!\llbracket {\varphi _{st}^n}\rrbracket \!(t) = \delta (s, t)\).

6 Computing \(f_{st}^n\)

Proposition 19 states that the functions \(f_{st}^n\) exist. Below, we will show that these functions can be computed in polynomial time.

Let \((s, t) \in S^2_?\). The function \(f_{st}^0\) is defined as the constant zero function; it satisfies \(\delta _1(s, t) = f_{st}^0 \cdot (\tau (s) - \tau (t))\) and can trivially be computed in polynomial time. To prove that the remaining functions \(f_{st}^n\), with \(n \ge 1\), can be computed in polynomial time as well, we use Orlin's primal network simplex algorithm [42] to solve minimum-cost flow problems and Khachiyan's ellipsoid method [32] to solve linear programming problems. As we will show below, \(f_{st}^n\) can be computed as \({\textsc {FindVertex}}(\delta _n, \tau (s), \tau (t))\).

figure au

In line 4 we use Orlin’s primal network simplex algorithm to compute the minimum cost for the following network (NE). The nodes of the network consist of two copies of each \(u \in S\), denoted \(u_0\) and \(u_1\). The supply of node \(u_0\) is \(\mu (u)\) and the demand of node \(u_1\) is \(\nu (u)\). Each edge \((u_0, v_1)\) has cost \(d(u, v)\).

figure av

Each \(\omega \in \varOmega _{\mathbb {R}}(\mu , \nu )\) corresponds to a feasible flow, where \(\omega (u, v)\) captures the flow from \(u_0\) to \(v_1\). The constraints \(\omega (u, S) = \mu (u)\) and \(\omega (S, u) = \nu (u)\), defining \(\varOmega _{\mathbb {R}}(\mu , \nu )\), capture that the supply of \(u_0\) flows from \(u_0\) and the demand of \(u_1\) flows to \(u_1\). For a feasible flow \(\omega \), its cost is \(\omega \cdot d\). Hence, \(d_{\mu \nu }\) captures the minimum cost.

Note that, by definition, the supplies and demands are rational. We can prove that \(d_{\mu \nu } = \omega \cdot d\) for some \(\omega \in \varOmega _{\mathbb {Q}}(\mu , \nu )\). Since d is rational as well, we can conclude that \(d_{\mu \nu }\) is also rational. Orlin’s primal network simplex algorithm computes the minimum cost and, hence, can be used to compute \(d_{\mu \nu }\). The algorithm is strongly polynomial, running in \(O(|N|^2 |E|^2 \log |N|)\) time. Since the network has 2|S| nodes and \(|S|^2\) edges, \(d_{\mu \nu }\) can be computed in \(O(|S|^6 \log |S|)\) time.
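As a stand-in for Orlin's primal network simplex algorithm, which we do not reproduce here, the following sketch computes the minimum cost of the network (NE) by successive shortest augmenting paths. The node naming ("a"/"b" for the two copies of each state) and the function name `kantorovich` are ours, chosen for illustration; a production implementation would use a dedicated min-cost flow solver.

```python
from collections import defaultdict

def kantorovich(mu, nu, d):
    """Minimum cost of transporting distribution mu onto nu with ground
    cost d(u, v), via successive shortest augmenting paths in the
    residual graph of the network (NE)."""
    S = sorted(set(mu) | set(nu))
    SRC, SNK = "src", "snk"
    nodes = [SRC, SNK] + [("a", s) for s in S] + [("b", s) for s in S]
    cap = defaultdict(float)  # residual capacities
    cost = {}                 # arc costs; reverse arcs get the negated cost

    def add_arc(u, v, c, w):
        cap[(u, v)] += c
        cost[(u, v)], cost[(v, u)] = w, -w

    for s in S:
        add_arc(SRC, ("a", s), mu.get(s, 0.0), 0.0)  # supply of s_0
        add_arc(("b", s), SNK, nu.get(s, 0.0), 0.0)  # demand of s_1
        for t in S:
            add_arc(("a", s), ("b", t), float("inf"), d(s, t))

    total = 0.0
    while True:
        # Bellman-Ford: residual arcs may carry negative cost
        dist = {n: float("inf") for n in nodes}
        prev = {}
        dist[SRC] = 0.0
        for _ in range(len(nodes) - 1):
            changed = False
            for (u, v), c in list(cap.items()):
                if c > 1e-12 and dist[u] + cost[(u, v)] < dist[v] - 1e-12:
                    dist[v], prev[v], changed = dist[u] + cost[(u, v)], u, True
            if not changed:
                break
        if dist[SNK] == float("inf"):
            break  # all demand has been routed
        # augment along the shortest path by its bottleneck capacity
        path, v = [], SNK
        while v != SRC:
            path.append((prev[v], v))
            v = prev[v]
        f = min(cap[e] for e in path)
        for u, v in path:
            cap[(u, v)] -= f
            cap[(v, u)] += f
        total += f * dist[SNK]
    return total

# With the discrete 0/1 ground cost, the minimum cost coincides with the
# total variation distance between the two distributions.
zero_one = lambda s, t: 0.0 if s == t else 1.0
tv = kantorovich({"x": 0.5, "y": 0.5}, {"x": 0.75, "y": 0.25}, zero_one)
```

Each \(\omega \in \varOmega _{\mathbb {R}}(\mu , \nu )\) corresponds to a feasible flow in this encoding, exactly as described above; the function returns \(d_{\mu \nu }\) as the accumulated cost of the augmentations.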

In line 5 we use Khachiyan’s ellipsoid method to find a feasible solution of a linear programming problem with the variables \(x_s\), for \(s \in S\), and the constraints

$$\begin{aligned} \forall s, t \in S :&\, x_s - x_t \le d(s, t)\\ \forall s \in S :&\, x_s \ge 0\\ \forall s \in S :&\, x_s \le 1\\&\sum _{s \in S} x_s \, (\mu (s) - \nu (s)) = d_{\mu \nu } \end{aligned}$$

By means of the ellipsoid method we can find a vertex of the convex polytope defined by the above constraints. This method is polynomial in the size of the constraints, in this case, the size of d, \(\mu \), \(\nu \), and \(d_{\mu \nu }\).
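To make the constraint system concrete, the following sketch checks whether a candidate assignment to the variables \(x_s\) is feasible. The helper name `is_feasible` and the example data are ours; a real implementation would, as described above, use an LP solver such as the ellipsoid method to find a vertex rather than test candidate points.

```python
def is_feasible(x, states, d, mu, nu, d_mu_nu, eps=1e-9):
    """Check a candidate x against the constraints of the linear program:
    x_s - x_t <= d(s, t), 0 <= x_s <= 1, and
    sum_s x_s (mu(s) - nu(s)) = d_mu_nu."""
    for s in states:
        if x[s] < -eps or x[s] > 1 + eps:
            return False  # violates 0 <= x_s <= 1
        for t in states:
            if x[s] - x[t] > d[(s, t)] + eps:
                return False  # violates x_s - x_t <= d(s, t)
    lhs = sum(x[s] * (mu[s] - nu[s]) for s in states)
    return abs(lhs - d_mu_nu) <= eps  # equality constraint

# Illustrative two-state instance with the 0/1 ground distance:
# moving all mass from x to the uniform distribution has cost 1/2.
d = {("x", "x"): 0.0, ("y", "y"): 0.0, ("x", "y"): 1.0, ("y", "x"): 1.0}
mu, nu = {"x": 1.0, "y": 0.0}, {"x": 0.5, "y": 0.5}
ok = is_feasible({"x": 1.0, "y": 0.0}, ["x", "y"], d, mu, nu, 0.5)
```

Here the point \(x_{\textrm{x}} = 1, x_{\textrm{y}} = 0\) is a vertex of the polytope, which is the kind of solution FindVertex returns.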

Let \(n \ge 1\) and \((s, t) \in S^2_?\). Since we can show that \(\delta _n\) is rational and \(\delta _n(s, s)~=~0\) for all \(s \in S\), we can apply FindVertex to \(\delta _n\), \(\tau (s)\) and \(\tau (t)\). In this case, line 4 computes \(\inf _{\omega \in \varOmega _{\mathbb {R}}(\tau (s), \tau (t))} \omega \cdot \delta _n\), which equals \(\delta _{n+1}(s, t)\). As a consequence, \({\textsc {FindVertex}}(\delta _n, \tau (s), \tau (t))\) returns \(f_{st}^n\) such that \(f_{st}^n \cdot (\tau (s) - \tau (t)) = \delta _{n+1}(s, t)\).

As we already observed above, line 4 can be computed in polynomial time in the size of the labelled Markov chain and line 5 can be computed in polynomial time in the size of \(\delta _n\), \(\tau (s)\), \(\tau (t)\), and \(\delta _{n+1}(s, t)\), which we can show to be polynomial time in the size of the labelled Markov chain and n. Hence, the running time of \({\textsc {FindVertex}}(\delta _n, \tau (s), \tau (t))\) is polynomial in the size of the labelled Markov chain and n.
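The distance iteration that FindVertex supports can be illustrated on a small chain. In the sketch below every transition distribution is supported on two states, so the coupling polytope is a line segment and the linear cost attains its minimum at an endpoint; no LP machinery is needed. The chain, its labels, and the helper names are invented for illustration, and the update rule (1 for differently labelled pairs, 0 on the diagonal, minimum coupling cost otherwise) is one common formulation of the fixed-point characterization, not a transcription of the paper's algorithm.

```python
def coupling_cost_min(mu, nu, d, a, b):
    """Exact minimum-cost coupling when both distributions live on the
    two-element support {a, b}: couplings form a segment parametrized by
    x = omega(a, a), and the linear cost is minimized at an endpoint."""
    lo = max(0.0, mu[a] - nu[b])
    hi = min(mu[a], nu[a])
    def cost(x):
        return (x * d[(a, a)] + (mu[a] - x) * d[(a, b)]
                + (nu[a] - x) * d[(b, a)] + (nu[b] - mu[a] + x) * d[(b, b)])
    return min(cost(lo), cost(hi))

# Hypothetical 4-state chain: g and p are absorbing with distinct labels;
# s and t both move to g/p, but with different probabilities.
label = {"g": "green", "p": "purple", "s": "white", "t": "white"}
tau = {"g": {"g": 1.0, "p": 0.0}, "p": {"g": 0.0, "p": 1.0},
       "s": {"g": 0.5, "p": 0.5}, "t": {"g": 0.75, "p": 0.25}}
S = ["g", "p", "s", "t"]

# delta_0 distinguishes only differently labelled states.
delta = {(u, v): (0.0 if label[u] == label[v] else 1.0)
         for u in S for v in S}
for _ in range(10):  # iterate the distance operator
    delta = {(u, v): (1.0 if label[u] != label[v] else
                      0.0 if u == v else
                      coupling_cost_min(tau[u], tau[v], delta, "g", "p"))
             for u in S for v in S}
```

For this chain the iteration stabilizes with distance 0.25 between s and t: the cheapest coupling matches the common 0.5 of g-mass at cost 0 and routes the remaining 0.25 from p to g at cost \(\delta(g, p) = 1\).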

7 The Algorithm

Given a labelled Markov chain \(\langle S, L, \tau , \ell \rangle \) and \(N \in \mathbb {N}\), we can explain the distances \(\delta (s, t)\) for s, \(t \in S\) by computing the formulas \(\varphi _{st}^n\) for \(0 \le n \le N\). To obtain this sequence of formulas, we implement Definition 21 as follows. Below, for s, t, \(u \in S\), we use the array cells \(\textrm{distance}[s][t]\), \(\textrm{function}[s][t][u]\), and \(\textrm{formula}[s][t][n]\) to represent the distance \(\delta _{n-1}(s, t)\), the function value \(f_{st}^{n-1}(u)\), and the formula \(\varphi _{st}^n\), respectively. In line 5–17, we compute \(\delta _0\), \(f_{st}^0\), \(\varphi _{st}^0\), and \(\varphi _{st}^1\). The loop of line 20–50 first computes the distances \(\delta _n\) (line 21–27), then determines the functions \(f_{st}^n\) (line 30), and finally computes the formulas \(\varphi _{st}^{n+1}\) (line 31–49).

figure ax
figure ay

Let us first discuss the correctness of the above algorithm. In line 4, \(\sim \) is computed by deciding probabilistic bisimilarity. The loop spanning line 20–50 has the following invariant.

$$\begin{aligned} \forall s, t \in S : \textrm{distance}[s][t]&= \delta _{n-1}(s, t)\end{aligned}$$
(1)
$$\begin{aligned} \forall (s, t) \in S^2_? : \forall u \in S : \textrm{function}[s][t][u]&= f_{st}^{n-1}(u)\end{aligned}$$
(2)
$$\begin{aligned} \forall s, t \in S : \forall 0 \le i \le n : \textrm{formula}[s][t][i]&= \varphi _{st}^i \end{aligned}$$
(3)

Let us check that the above loop invariant holds when we reach line 21 for the first time. In line 7 we set distance to zero. Hence, (1) is satisfied when we reach line 21. In line 17 we set function to zero. Hence, (2) is also satisfied when we reach line 21. In line 6, 10, 13, and 15 we set formula such that (3) is satisfied when we reach line 21.

Next, we check that the loop maintains the above invariant, that is, if the invariant holds at line 21 then it also holds at line 50. Assume that the invariant holds at line 21. From (2) and line 22–27 we can conclude that

$$ \textrm{distance}[s][t] = \left\{ \begin{array}{ll} 0 &{} \,\textrm{if}\,(s, t) \in S^2_0\\ 1 &{} \,\textrm{if}\,(s, t) \in S^2_1\\ f_{st}^{n-1} \cdot (\tau (s) - \tau (t)) &{} \,\textrm{otherwise} \end{array} \right. $$

once we arrive at line 28. Hence, from Proposition 19 we can conclude that \(\textrm{distance}[s][t] = \delta _n(s, t)\) for all s, \(t \in S\). Therefore, (1) holds at line 50.

Since \(\textrm{distance} = \delta _n\) at line 30 and, as we have seen in Section 6, \({\textsc {FindVertex}}(\delta _n, \tau (s), \tau (t))\) returns \(f_{st}^n\), we assign \(f_{st}^n\) to \(\textrm{function}[s][t]\) in line 30. Hence, (2) holds at line 50. We can also verify that line 31–49 ensure that (3) is maintained by the loop.

Finally, we will argue that the running time of the above algorithm is polynomial in the size of the labelled Markov chain and N. Probabilistic bisimilarity can be decided in polynomial time, as was first shown by Baier [3]. More efficient algorithms have been proposed by Buchholz [7], by Derisavi, Hermanns, and Sanders [17], and by Valmari and Franceschinis [50]. Hence, line 4 runs in polynomial time.

Each line of 6–17 can be implemented in constant time. Since each line of this part is executed at most \(N |S|^3\) times, the running time of line 5–17 is polynomial in the size of the labelled Markov chain and N.

The loop consisting of line 20–50 is executed \(N-1\) times. As we already discussed in Section 6, the running time of \({\textsc {FindVertex}}(\delta _n, \tau (s), \tau (t))\) is polynomial in the size of the labelled Markov chain and n. When we arrive at line 30, distance equals \(\delta _n\) and, hence, this line is polynomial in the size of the labelled Markov chain and n. All other lines of the loop can be implemented in constant time. Each line is executed at most \(|S|^4\) times. Therefore, the running time of line 20–50 is polynomial in the size of the labelled Markov chain and N.

8 Conclusion

In this paper, we study a minor variation of the logic introduced by Desharnais et al. in [20]. In particular, we show that

1.

    the logic is a minimal one that characterizes the probabilistic bisimilarity distances,

2.

    in general, there does not exist a distinguishing formula \(\varphi _{st}\) for states s and t such that \(\llbracket {\varphi _{st}}\rrbracket \!(s) - \!\llbracket {\varphi _{st}}\rrbracket \!(t) = \delta (s, t)\),

3.

    there exists a sequence \((\varphi _{st}^n)_{n}\) of formulas that explains distance \(\delta (s, t)\) as \(\lim _{n \rightarrow \infty } \!\llbracket {\varphi _{st}^n}\rrbracket \!(s) - \!\llbracket {\varphi _{st}^n}\rrbracket \!(t) = \delta (s, t)\), and

4.

    each formula \(\varphi _{st}^n\) can be computed in polynomial time.

As pointed out by Hillerström in [30], an early paper on computing distinguishing formulas, to explain why states are not bisimilar “arguments must be concise in the sense that an argument must not contain redundant or irrelevant information.” This applies to our setting as well. The distinguishing formulas introduced in Definition 21 are in many cases far from concise. We leave the simplification of these formulas for future research.

One may wonder whether adding fixed points to the logic, in the form of variables X and either operators \(\mu X\) and \(\nu X\) or equations of the form \(X = \varphi \), would allow us to explain the probabilistic bisimilarity distance of two states by means of a single formula. A logic similar to the one studied in this paper that contains fixed points has been studied by De Alfaro et al. [1]. Whether simply adding fixed points to the logic suffices is not immediately clear, as the \(\ominus p_n\) and \(\oplus q_n\) in the formula \(\psi _{uvst}^n\) vary as n varies. Extending the logic so that the probabilistic bisimilarity distance of two states can be explained by means of a single formula is another potential topic for future research.

Graf and Sifakis [28] introduce the notion of a characteristic formula for a state s: a state satisfies this formula if and only if it is behaviourally equivalent to s. Characteristic formulas have been developed for probabilistic bisimilarity. For example, Deng and van Glabbeek [16] present characteristic formulas for probabilistic automata. Sack and Zhang [47] introduce a general framework to construct characteristic formulas for probabilistic automata. In the setting of probabilistic bisimilarity distances, a characteristic formula for a state s of a labelled Markov chain can be formalized in one of the following two ways. The formula \(\varphi _s\) is a characteristic formula for the state s if

$$\begin{aligned} \textrm{for all states}~t\textrm{, } \!\llbracket {\varphi _s}\rrbracket \!(s) - \!\llbracket {\varphi _s}\rrbracket \!(t) = \delta (s, t) \end{aligned}$$
(4)

or

$$\begin{aligned} \textrm{for all states}~t\textrm{, } \!\llbracket {\varphi _s}\rrbracket \!(t) = \delta (s, t). \end{aligned}$$
(5)

It can be shown that (4) and (5) are equivalent: if there exists a formula that satisfies (4) then there also exists a (different) formula that satisfies (5). Whether such a formula or a sequence of such formulas exists for the logic studied in this paper is an open question that may be tackled in future research.

A preliminary implementation of the algorithm in Java is available [45]. Improving the code is another avenue for further research.