What’s Wrong with On-the-Fly Partial Order Reduction

  • Stephen F. SiegelEmail author
Open Access
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11562)


Partial order reduction and on-the-fly model checking are well-known approaches for improving model checking performance. The two optimizations interact in subtle ways, so care must be taken when using them in combination. A standard algorithm combining the two optimizations, published over twenty years ago, has been widely studied and deployed in popular model checking tools. Yet the algorithm is incorrect. Counterexamples were discovered using the Alloy analyzer. A fix for a restricted class of property automata is proposed.


Model checking Partial order reduction On-the-fly Spin 

1 Introduction

Partial order reduction (POR) refers to a family of model checking techniques used to reduce the size of the state space that must be explored when verifying a property of a program. The techniques vary, but all share the core observation that when two independent operations are enabled in a state, it is often safe to ignore traces that begin with one of them. A large number of POR techniques have been explored, differing in details such as the range of properties to which they apply. This paper focuses on ample set POR [4], an approach which applies to stutter-invariant properties and is used in the model checker Spin [8].

In the automata-theoretic view of model checking, the negation of the property to be verified is represented by an \(\omega \)-automaton. The basic algorithm computes the product of this automaton with the state space of the program. The language of the product is empty if and only if the program cannot violate the property. On-the-fly model checking refers to an optimization of this basic algorithm in which the enumeration of the reachable program states, computation of the product, and language emptiness check are interleaved, rather than occurring in sequence.

These two optimizations must be combined with care, because they interact in subtle ways.1 A standard algorithm for on-the-fly ample set POR is described in [12] and in further detail in [13]. I shall refer to this algorithm as the combined algorithm. Theorem 4.2 of [13] asserts the soundness of the combined algorithm. A proof of the theorem is also given in [13].

The proof has a gap. This was pointed out in [16, Sect. 5], with details in [15]. The gap was rediscovered in the course of developing mechanized correctness proofs for model checking algorithms; an explicit counterexample to the incorrect proof step was also found ([2, Sect. 8.4.5] and [3, Sect. 5]). The fact that the proof is erroneous, however, does not imply the theorem is wrong. To the best of my knowledge, no one has yet produced a proof or a counterexample for the soundness of the combined algorithm.

In this paper, I show that the combined algorithm is not sound; a counterexample is given in Sect. 3.1. I found this counterexample by modeling the combined algorithm in Alloy and using the Alloy analyzer [11] to check its soundness. Sect. 4 describes this model. Spin’s POR is based on the combined algorithm, and in Sect. 5, Spin is seen to return an incorrect result on a Promela model derived from the theoretical counterexample.

There is a small adjustment to the combined algorithm, yielding an algorithm that is arguably more natural and that returns the correct result on the previous counterexample; this is described in Sect. 6. It turns out this one is also unsound, as demonstrated by another Alloy-produced counterexample. However, in Sect. 7, I show that this variation is sound if certain restrictions are placed on the property automaton.

2 Preliminaries

Definition 1

A finite state program is a triple \(P=\langle T,Q,\iota \rangle \), where Q is a finite set of states, \(\iota \in Q\) is the initial state, and T is a finite set of operations. Each operation \(\alpha \in T\) is a function from a set \(\mathsf {en}_\alpha \subseteq Q\) to Q.

Fix a finite state program \(P=\langle T,Q,\iota \rangle \).

Definition 2

For \(q\in Q\), define \(\mathsf {en}(q)=\{\alpha \in T\mid q\in \mathsf {en}_\alpha \}\).

Definition 3

An execution of P is an infinite sequence of operations \(\alpha _1\alpha _2\cdots \) that generates the sequence of states \(\xi =q_0q_1q_2\cdots \) such that \(q_0=\iota \) and for \(i\ge 0\), \(q_i\in \mathsf {en}_{\alpha _{i+1}}\) and \(q_{i+1}=\alpha _{i+1}(q_i)\). An admissible sequence is any segment of an execution.

Definition 4

A Büchi automaton is a tuple \(\mathcal {B}=\langle S,\varDelta , \varSigma , \delta , F\rangle \), where S is a finite set of automaton states, \(\varDelta \subseteq S\) is the set of initial states, \(\varSigma \) is a finite set called the alphabet, \(\delta \subseteq S\times \varSigma \times S\) is the transition relation, and \(F\subseteq S\) is the set of accepting states. The language of \(\mathcal {B}\), denoted \(\mathcal {L}(B)\), is the set of all \(\xi \in \varSigma ^\omega \) generated by infinite paths in \(\mathcal {B}\) that pass through an accepting state infinitely often.

Fix a finite set \(\textsf {AP}\) of atomic propositions and let \(\varSigma =2^{\textsf {AP}}\).

Fix an interpretation mapping for P, i.e., a function \(L:Q\rightarrow \varSigma \).

Definition 5

The language of P, denoted \(\mathcal {L}(P)\), is the set of all infinite words \(L(q_0)L(q_1)\cdots \in \varSigma ^\omega \), where \(q_0q_1\cdots \) is the sequence of states generated by an execution of P.

Definition 6

A language \(L\subseteq \varSigma ^{\omega }\) is stutter-invariant if, for any \(a_0,a_1,\ldots \in \varSigma \) and positive integers \(i_0,i_1\ldots \), \(a_0a_1\cdots \in L \Leftrightarrow a_0^{i_0}a_1^{i_1}\cdots \in L\), where \(a^i\) denotes the concatenation of i copies of a.

Definition 7

Let \(\mathcal {B}=\langle S,\varDelta , \varSigma , \delta , F\rangle \), be a Büchi automaton with alphabet \(\varSigma \). The product of P and \(\mathcal {B}\) is the Büchi automaton
$$ P\otimes \mathcal {B}= \langle Q\times S, \{\iota \}\times \varDelta , T\times \varSigma , \delta _{\otimes }, Q\times F\rangle , $$
$$ \delta _{\otimes }=\{ (\langle q,s\rangle , \langle \alpha ,\sigma \rangle , \langle q',s'\rangle ) \mid \sigma =L(q) \wedge \langle s,\sigma ,s'\rangle \in \delta \wedge q'=\alpha (q) \}. $$

Note 1

A transition from product state \(x=\langle q,s\rangle \) can be viewed as taking place in two steps. First, a transition \(s\xrightarrow {L(q)}s'\) in \(\mathcal {B}\) executes, leading to an “intermediate state” \(x'=\langle q,s'\rangle \). Then a program transition \(q\xrightarrow {\alpha }q'\) executes, culminating in \(y=\langle q',s'\rangle \). While this is a good mental model, the product automaton does not necessarily contain a transition from x to \(x'\) or from \(x'\) to y. The intermediate state \(x'\) is not even necessarily reachable in the product. The transition in the product goes directly from x to y with label \(\langle \alpha ,L(q)\rangle \).

It is well-known that
$$ \mathcal {L}(P)\cap \mathcal {L}(\mathcal {B})=\emptyset \Leftrightarrow \mathcal {L}(P\otimes \mathcal {B})=\emptyset . $$
In the context of model checking, \(\mathcal {B}\) is used to represent the negation of a desirable property; the program P satisfies the property if, and only if, no execution of P is accepted by \(\mathcal {B}\), i.e., \(\mathcal {L}(P)\,\cap \,\mathcal {L}(\mathcal {B})=\emptyset \). The automaton \(\mathcal {B}\) may be generated from a (negated) LTL formula, but that assumption is not needed here.
The goal of “offline” (not on-the-fly) partial order reduction is to generate some subspace \(P'\) of P with the guarantee that
$$ \mathcal {L}(P')\cap \mathcal {L}(\mathcal {B})=\emptyset \Leftrightarrow \mathcal {L}(P)\cap \mathcal {L}(\mathcal {B})=\emptyset $$
The emptiness of \(\mathcal {L}(P'\otimes \mathcal {B})=\mathcal {L}(P')\cap \mathcal {L}(\mathcal {B})\) can be decided in various ways, such as a nested depth first search (NDFS) [5].

3 On-the-Fly Partial Order Reduction

In on-the-fly model checking, the state space of the product automaton is enumerated directly, without first enumerating the program states. Adding POR to the mix means that at each state reached in the product automaton, some subset of enabled transitions will be explored. The goal is to ensure that if the language of the full product automaton is nonempty, then the language of the resulting reduced automaton must be nonempty.

To make this precise, fix a finite state program \(P=\langle T,Q,\iota \rangle \), a set \(\textsf {AP}\) of atomic propositions, an interpretation \(L:Q\rightarrow \varSigma =2^{\textsf {AP}}\), and Büchi automaton \(\mathcal {B}=\langle S,\varDelta ,\varSigma ,\delta ,F\rangle \). Let \(\mathcal {A}=P\otimes \mathcal {B}\).

Definition 8

A function \(\mathsf {amp}:Q\times S\rightarrow 2^T\) is an ample selector if \(\mathsf {amp}(q,s)\subseteq \mathsf {en}(q)\) for all \(q\in Q, s\in S\). Each \(\mathsf {amp}(q,s)\) is an ample set.

An ample selector determines a subautomaton \(\mathcal {A}'=\mathsf {reduced}(\mathcal {A},\mathsf {amp})\) of \(\mathcal {A}\): \(\mathcal {A}'\) is defined exactly as in Definition 7, except that the transition relation has the additional restriction that \(\alpha \in \mathsf {amp}(q,s')\):
$$\begin{aligned} \mathcal {A}'=\langle Q\times S, \{\iota \}\times \varDelta , T\times \varSigma , \delta ', Q\times F\rangle \end{aligned}$$
$$\begin{aligned} \begin{aligned} \delta ' =&\left\{ (\langle q,s\rangle , \langle \alpha ,\sigma \rangle , \langle q',s'\rangle )\in (Q\times S) \times (T\times \varSigma ) \times (Q\times S) \mid \right. \\&\left. \sigma =L(q) \wedge \langle s,\sigma ,s'\rangle \in \delta \wedge \alpha \in \mathsf {amp}(q,s')\wedge q'=\alpha (q) \right\} . \end{aligned} \end{aligned}$$

Definition 9

An ample selector \(\mathsf {amp}\) is POR-sound if the following holds:
$$ \mathcal {L}(\mathsf {reduced}(\mathcal {A},\mathsf {amp}))=\emptyset \Leftrightarrow \mathcal {L}(P)\cap \mathcal {L}(\mathcal {B})=\emptyset . $$

The goal is to define some constraints on an ample selector that guarantee it is POR-sound. Before stating the constraints, we need two more concepts:

Definition 10

An independence relation is an irreflexive and symmetric relation \(I\subseteq T\times T\) satisfying the following: if \((\alpha ,\beta )\in I\) and \(q\in \mathsf {en}_\alpha \cap \mathsf {en}_\beta \), then \(\alpha (q)\in \mathsf {en}_\beta \), \(\beta (q)\in \mathsf {en}_\alpha \), and \(\alpha (\beta (q))=\beta (\alpha (q))\).

Fix an independence relation I. We say \(\alpha \) and \(\beta \) are dependent if \((\alpha ,\beta )\not \in I\).

Definition 11

An operation \(\alpha \in T\) is invisible with respect to \(L\) if, for all \(q\in \mathsf {en}_\alpha \), \(L(q)=L(\alpha (q))\).

Note 2

The definition in [13] is slightly different. Given an LTL formula \(\phi \) over \(\textsf {AP}\), let \(\textsf {AP}'\) be the set of atomic propositions occurring syntactically in \(\phi \). The definition in [13] says \(\alpha \) is invisible in \(\phi \) if, for all \(p\in \textsf {AP}'\) and \(q\in \mathsf {en}_\alpha \), \(p\in L(q)\Leftrightarrow p\in L(\alpha (q))\). However, there is no loss of generality using Definition 11, since one can define a new interpretation \(L':Q\rightarrow 2^{\textsf {AP}'}\) by \(L'(q)=L(q)\cap \textsf {AP}'\). Then \(\alpha \) is invisible for \(\phi \) if, and only if, \(\alpha \) is invisible with respect to \(L'\), and the results of this paper can be applied without modification to P, \(\textsf {AP}'\), and \(L'\).

We now define the following constraints on an ample selector \(\mathsf {amp}\):2

For all \(q\in Q\), \(s\in S\): \(\mathsf {en}(q)\ne \emptyset \implies \mathsf {amp}(q,s)\ne \emptyset \).


For all \(q\in Q\), \(s\in S\): in any admissible sequence in P starting from q, no operation in \(T\setminus \mathsf {amp}(q,s)\) that is dependent on an operation in \(\mathsf {amp}(q,s)\) can occur before some operation in \(\mathsf {amp}(q,s)\) occurs.


For all \(q\in Q\), \(s\in S\): if \(\mathsf {amp}(q,s)\ne \mathsf {en}(q)\) then \(\forall \alpha \in \mathsf {amp}(q,s)\), \(\alpha \) is invisible.


There is a depth-first search of \(\mathcal {A}'=\mathsf {reduced}(\mathcal {A},\mathsf {amp})\) with the following property: whenever there is a transition in \(\mathcal {A'}\) from a node \(\langle q,s\rangle \) on the top of the stack to a node \(\langle q',s'\rangle \) on the stack, \(\mathsf {amp}(q,s')=\mathsf {en}(q)\).

Condition C3 is the interesting one. The combined algorithm of [13] enforces it using a DFS (the outer search of the NDFS) of the reduced space and the following protocol: given a new state \(\langle q,s\rangle \) that has just been pushed onto the stack, first iterate over all Büchi transitions \(\langle s,L(q),s'\rangle \) departing from s and labeled by L(q). For each of these, a candidate ample set for \(\mathsf {amp}(q,s')\) that satisfies the first three conditions is computed; this computation does not depend on \(s'\). If any operation in that candidate set leads back to a state on the search stack (a “back edge”), a different candidate is tried and the process is repeated until a satisfactory one is found. If no such candidate is found, \(\mathsf {en}(q)\) is used for the ample set.

Hence the process for choosing the ample set depends on the current state of the search. If \(y_1\ne y_2\), it is not necessarily the case that \(\mathsf {amp}(x,y_1)=\mathsf {amp}(x,y_2)\), because it is possible that when \(\langle x,y_1\rangle \) was encountered, a back edge existed for a candidate, but when \(\langle x,y_2\rangle \) was encountered, there was no back edge.

3.1 Counterexample

Theorem 4.2 of [13] can be expressed as follows: if \(\mathcal {L}(\mathcal {B})\) is stutter-invariant and the language of an LTL formula, and \(\mathsf {amp}\) satisfies C0–C3, then \(\mathsf {amp}\) is POR-sound.
Fig. 1.

Counterexample to combined theorem. Left: program and interpretation. Center: property automaton \(\mathcal {B}_1\) and ample selector function. Right: the reachable product state space; dashed edges are in the full, but not reduced, space.

A counterexample to this claim is given in Fig. 1. The program consists of two states, A and B, and two operations, \(\alpha \) and \(\beta \). There is a single atomic proposition, p, which is false at A and true at B. Note that \(\alpha \) and \(\beta \) are independent. Also, \(\alpha \) is invisible, and \(\beta \) is not.

The property automaton, \(\mathcal {B}_1\), is shown in Fig. 1 (center top). It has two states, numbered 0 and 1. State 1 is the sole accepting state. The language consists of all infinite words of the following form: a finite nonempty prefix of \(\varnothing \)s followed by an infinite sequence of \(\{p\}\)s. This language is stutter-invariant, and is the language of the LTL formula \((\lnot p)\wedge ((\lnot p)\mathbf U \, \mathbf G p)\).

The ample selector is specified by the table (center bottom). Notice that \(\mathsf {amp}(A,1)\ne \mathsf {en}(A)\), but the other three ample sets are full. C0 holds because the ample sets are never empty. C1 holds because \(\beta \) is independent of \(\alpha \). C2 holds because \(\alpha \) is invisible. The reachable product space is shown in Fig. 1 (right). In any DFS of \(\mathsf {reduced}(\mathcal {A},\mathsf {amp})\), the only back edge is the self-loop on A0 labeled \(\langle \alpha ,\varnothing \rangle \). Since \(\mathsf {amp}(A,0)\) is full, C3 holds. Yet there is an accepting path in the full space, but not in the reduced space.

4 Alloy Models of POR Schemes

Alloy is a “lightweight formal methods” language and tool. It has been used in a wide variety of contexts, from exploring software designs to studying weak memory-consistency models. An Alloy model specifies signatures, each of which defines a type, relations on signatures, and constraints on the signatures and relations. Constraints are expressed in a logic that combines elements of first order logic and relational logic, and includes a transitive closure operator. An instance of a model assigns a finite set of atoms to each signature, and a finite set of tuples (of the right type) to each relation, in such a way that the constraints are satisfied. The Alloy analyzer can be used to check that an assertion holds on all instances in which the sizes of the signatures are within some specified bounds. The analyzer converts the question of the validity of the assertion into a SAT problem and invokes a SAT solver. Based on the result, it reports either that the assertion holds within the given bounds, or it produces an instance of the model violating the assertion.

I developed an Alloy model to search for counterexamples to various POR claims, such as the one in Sect. 3.1. The model encodes the main concepts of the previous two sections, including program, operations, interpretation, invisibility and independence, property automaton, the product space, ample selectors and the constraints on them, and a language emptiness predicate. The model culminates in an assertion which states that an ample selector satisfying the four constraints is POR-sound.

I was not able to find a way to encode stutter-invariance. In the end, I developed a small set of Büchi automata based on my own intuition of what would make interesting tests. I encoded these in Alloy and used the analyzer to explore all possible programs and ample selectors for each.

The first part of the model is a simple encoding of a finite state automaton. The following is a listing of file ba.als:
The alphabet is some unconstrained set Sigma. The set of states is represented by signature BState. There is a single initial state, and any number of accepting states. Each transition has a source and destination state, and label. Relations declared within a signature declaration have that signature as an implicit first argument. So, for example, src is a binary relation of type \(\texttt {BTrans}\times \texttt {BState}\). Furthermore, the relation is many-to-one: each transition has exactly one BState atom associated to it by the src relation.
The remaining concepts are incorporated into module Open image in new window :
The facts are constraints that any instance must satisfy; some of the facts are given names for readability. A pred declaration defines a (typed) predicate.

Most aspects of this model are self-explanatory; I will comment only on the less obvious features. The relations nextFull and nextReduced represent the next state relations in the full and reduced spaces, respectively. They are declared in ProdState, but specified completely in the final fact on lines 56–58. Strictly speaking, one could remove those predicates and substitute their definitions, but this seemed more convenient. Line 60 asserts that a product state is determined uniquely by its program and property components. Line 61 specifies the initial product state.

Line 59 insists that only states reachable (in the full space) from the initial state will be included in an instance (* is the reflexive transitive closure operator). Lines 62–64 specify the converse. Hence in any instance of this model, ProdState will consist of exactly the reachable product states in the full space.

The encoding of C1 is based on the following observation: given \(q\in Q\) and a set A of operations enabled at q, define \(r\subseteq Q\times Q\) by removing from the program’s next-state relation all edges labeled by operations in A. Then “no operation dependent on an operation in A can occur unless an operation in A occurs first” is equivalent to the statement that on any path from q using edges in r, all enabled operations encountered will either be in A or independent of every operation in A.

Condition C3 is difficult to encode, in that it depends on specifying a depth-first search. I have replaced it with a weaker condition, which is similar to a well-known cycle proviso in the offline theory:
\(\mathbf C3 '\)

In any cycle in \(\mathsf {reduced}(\mathcal {A}, \mathsf {amp})\), there is a transition from \(\langle q,s\rangle \) to \(\langle q',s'\rangle \) for which \(\mathsf {amp}(q,s')=\mathsf {en}(q)\).

Equivalently: if one removes from the reduced product space all such transitions, then the resulting graph should have no cycles. This is the meaning of lines 50–54 Open image in new window is the strict transitive closure operator).
The next step is to create tests for specific property automata. This example is for the automaton \(\mathcal {B}_1\) of Fig. 1:
The final step is a test that combines the modules above:
It places upper bounds on the numbers of operations, program states, and product states while checking the soundness assertion. Using the Alloy analyzer to check the assertion above results in a counterexample like the one in Fig. 1. The runtime is a fraction of a second. The Alloy instance uses two uninterpreted atoms for the elements of Sigma; I have simply substituted the sets \(\varnothing \) and \(\{p\}\) for them to produce Fig. 1. As we have seen, this counterexample happens to also satisfy the stronger constraint C3.

5 Spin

The POR algorithm used by Spin is described in [10] and is similar to the combined algorithm. We can see what Spin actually does by encoding examples in Promela and executing Spin with and without POR.
Fig. 2.

Promela representation of counterexample using \(\mathcal {B}_1\) of Fig. 1

Figure 2 shows an encoding of the example of Fig. 1. Transition \(\alpha \) corresponds to the assignment x = 0, where x is a variable local to p1. Transition \(\beta \) corresponds to the assignment p = 1, where p is a shared variable. Applying Spin with the following commands allows one to see the structure of the program graphs for each process, as well as each step in the search of the full space:
I did this with Spin version 6.4.9, the latest stable release. The output indicates that 4 states and 5 transitions are explored, and one state is matched—exactly as in Fig. 1 (right). As expected, the output also reports a violation—a path to an accepting cycle that corresponds to the transition from A0 to B1 followed by the self-loop on B1 repeated forever.

Repeat this experiment without the -DNOREDUCE, however, and Spin finds no errors. The output indicates that it misses the transition from A0 to B1.

6 Ignoring the Intermediate States

An interesting aspect of the combined algorithm is that the ample set is a function of an intermediate state. I.e., given a product state \(x=\langle q,s\rangle \), the ample set is determined by the intermediate state \(x'=\langle q,s'\rangle \) obtained after executing a property transition. This introduces a difference between the on-the-fly scheme and offline schemes, where there is no notion of intermediate state. It also introduces other complexities. For example, it is possible that \(x'\) was reached earlier in the search through some other state \(\langle q,s_2\rangle \), because of a property transition \(s_2\xrightarrow {L(q)}s'\). How does the algorithm guarantee that the ample set selected for \(x'\) will be the same as the earlier choice? This issue is not addressed in [13] or [10].

These problems go away if one simply makes the ample set a function of the source product state x. The intermediate states do not have to play a role. Specifically, given an ample selector \(\mathsf {amp}\), define \(\mathsf {reduced}_2(\mathcal {A}, \mathsf {amp})\) as in (1) and (2), except replace “\(\alpha \in \mathsf {amp}(q,s')\)” in (2) with “\(\alpha \in \mathsf {amp}(q,s)\)”. Perform the same substitution in C3 and call the resulting condition \(\mathbf C3 _1\). The weaker version of \(\mathbf C3 _1\) is simply:
\(\mathbf C3 '_1\)

In any cycle in \(\mathsf {reduced}_2(\mathcal {A}, \mathsf {amp})\) there is a state \(\langle q,s\rangle \) with \(\mathsf {amp}(q,s)=\mathsf {en}(q)\).

Conditions C0C2 are unchanged. I refer to this scheme as V1, and to the original combined algorithm as V0. The Alloy model of V0 in Sect. 4 can be easily modified to represent V1.

Using V1, the example of Fig. 1 is no longer a counterexample. In fact, Alloy reports there are no counterexamples using \(\mathcal {B}_1\), at least for small bounds on the program size. Figure 5 gives detailed results for this and other Alloy experiments.

Unfortunately, Alloy does find a counterexample for a slightly more complicated property automaton, \(\mathcal {B}_2\), which is shown in Fig. 3.
Fig. 3.

Counterexample to V1 with \(\mathcal {B}_2\) (center). A0 and A2 have proper ample set \(\{\alpha \}\).

The program is the same as the one in Sect. 3.1. Automaton \(\mathcal {B}_2\) has four states, with state 3 the sole accepting state. The language is the same as that of \(\mathcal {B}_1\): all infinite words formed by concatenating a finite nonempty prefix of \(\varnothing \)s and an infinite sequence of \(\{p\}\)s. If the prefix has odd length, the accepting run begins with the transition \(0\rightarrow 1\), otherwise it begins with the transition \(0\rightarrow 2\).

In the ample selector, only A0 and A2 are not fully enabled:
$$\begin{array}{c|cccc} \mathsf {amp}&{}0&{}1&{}2&{}3\\ \hline A&{} \{\alpha \} &{} \{\alpha ,\beta \} &{} \{\alpha \} &{} \{\alpha ,\beta \} \\ B&{} \{\alpha \} &{} \{\alpha \} &{} \{\alpha \} &{} \{\alpha \}. \end{array}$$
C0C2 hold for the reasons given in Sect. 3.1. \(\mathbf C3 _1\) holds for any DFS in which A2 is pushed onto the stack before A1. In that case, there is no back edge from A2; there will be a back edge when A1 is pushed, but A1 is fully enabled.

7 What’s Right

In this section, I show that POR scheme V1 of Sect. 6 is sound if one introduces certain assumptions on the property automaton. The following definition is similar to the notion of stutter invariant (SI) automaton in [6] and to that of closure under stuttering in [9]. The main differences derive from the use of Muller automata in [6] and Büchi transition systems in [9], while we are dealing with ordinary Büchi automata.

Definition 12

A Büchi automaton \(\mathcal {B}=\langle S, \{s_{\textit{init}}\}, \varSigma , \delta , F\rangle \), is in SI normal form if it has a single initial state \(s_{\textit{init}}\) with no incoming edges, and for each \(s\in S\setminus \{s_{\textit{init}}\}\), there is some \(a_s\in \varSigma \) such that the following all hold:
  1. 1.

    Every edge terminating in s is labeled \(a_s\).

  2. 2.

    s has exactly one outgoing edge with label \(a_s\).

  3. 3.

    If \(s\not \in F\) then \(\langle s,a_s,s\rangle \in \delta \).

  4. 4.

    If \(\langle s,a_s,s\rangle \not \in \delta \), then there exists \(s^{\sharp } \in S\setminus F\) such that (i) \(\langle s,a_s,s^{\sharp }\rangle \in \delta \) and (ii) for all \(a\in \varSigma \) and \(s'\in S\), \(\langle s,a,s'\rangle \in \delta \Leftrightarrow \langle s^{\sharp },a,s'\rangle \in \delta \).


Lemma 1

Let \(\mathcal {B}\) be a Büchi automaton in SI normal form. Suppose \(a,b\in \varSigma \) and \(a\ne b\). Both of the following hold:
  1. 1.

    If \(s_1{\mathop {\rightarrow }\limits ^{a}}s_2{\mathop {\rightarrow }\limits ^{b}}s_3\) is a path in \(\mathcal {B}\), then for some \(s_2'\in S\), \(s_1{\mathop {\rightarrow }\limits ^{a}}s_2{\mathop {\rightarrow }\limits ^{a}}s_2'{\mathop {\rightarrow }\limits ^{b}}s_3\) is a path in \(\mathcal {B}\).

  2. 2.

    If \(s_1{\mathop {\rightarrow }\limits ^{a}}s_2{\mathop {\rightarrow }\limits ^{a}}s_3{\mathop {\rightarrow }\limits ^{b}}s_4\) is a path in \(\mathcal {B}\), then \(s_1{\mathop {\rightarrow }\limits ^{a}}s_2{\mathop {\rightarrow }\limits ^{b}}s_4\) is a path in \(\mathcal {B}\). Moreover, if \(s_3\) is accepting, then \(s_2\) is accepting.

Following the approach of [6], one can show that the language of an automaton in SI normal form is stutter-invariant. Moreover, any Büchi automaton with a stutter-invariant language can be transformed into SI normal form without changing the language. The conversion satisfies \(|S'|\le O(|\varSigma ||S|)\), where |S| and \(|S'|\) are the number of states in the original and new automaton, respectively. For details and proofs, see [17]. An example is given in Fig. 4; the language of \(\mathcal {B}_3\) (or \(\mathcal {B}_4\)) consists of all words with a finite number of \(\{p\}\)s.
Fig. 4.

Property automaton \(\mathcal {B}_3\) and result of transformation to SI normal form, \(\mathcal {B}_4\).

Theorem 1

Suppose \(\mathcal {B}\) is in SI normal form and \(\mathsf {amp}:Q\times S\rightarrow 2^T\) is an ample selector satisfying C0C2 and \(\mathbf {C3}{} \mathbf{'_1}\). Then \(\mathsf {amp}\) is POR-sound.

The remainder of this section is devoted to the proof of Theorem 1. The proof is similar to the proof of the offline case in [4].

Let \(\theta \) be an accepting path in the full space \(\mathcal {A}\). An infinite sequence of accepting paths \(\pi _0,\pi _1,\ldots \) will be constructed, where \(\pi _0=\theta \). For each \(i\ge 0\), \(\pi _i\) will be decomposed as \(\eta _i\circ \theta _i\), where \(\eta _i\) is a finite path of length i in the reduced space, \(\theta _i\) is an infinite path, \(\eta _i\) is a prefix of \(\eta _{i+1}\), and \(\circ \) denotes concatenation. For \(i=0\), \(\eta _0\) is empty and \(\theta _0=\theta \).

Assume \(i\ge 0\) and we have defined \(\eta _j\) and \(\theta _j\) for \(j\le i\). Write
$$\begin{aligned} \theta _i \,\, = \,\, \langle q_0,s_0\rangle \xrightarrow {\langle \alpha _1,\sigma _0\rangle } \langle q_1,s_1\rangle \xrightarrow {\langle \alpha _2,\sigma _1\rangle } \cdots \end{aligned}$$
where \(\sigma _k=L(q_k)\) for \(k\ge 0\). Then \(\eta _{i+1}\) and \(\theta _{i+1}\) are defined as follows. Let \(A=\mathsf {amp}(q_0,s_0)\). There are two cases:

Case 1: \(\alpha _1\in A\). Let \(\eta _{i+1}\) be the path obtained by appending the first transition of \(\theta _i\) to \(\eta _i\), and \(\theta _{i+1}\) the path obtained by removing the first transition from \(\theta _i\).

Case 2: \(\alpha _1\not \in A\). Then there are two sub-cases:

Case 2a: Some operation in A occurs in \(\theta _i\). Let n be the index of the first occurrence, so that \(\alpha _n\in A\), but \(\alpha _j\not \in A\) for \(1\le j<n\). By C1, \(\alpha _j\) and \(\alpha _n\) are independent for \(1\le j<n\). By repeated application of the independence property, there are paths in P
By C2, \(\alpha _n\) is invisible, whence \(L(q_{j+1}')=\sigma _j\) for \(0\le j\le n-2\), and \(\sigma _{n-1}=\sigma _n\). Hence the admissible sequence
$$\begin{aligned} q_0 {\mathop {\rightarrow }\limits ^{\alpha _n}} q_1' {\mathop {\rightarrow }\limits ^{\alpha _1}} q_2' {\mathop {\rightarrow }\limits ^{\alpha _2}} q_3' \rightarrow \cdots {\mathop {\rightarrow }\limits ^{\alpha _{n-2}}} q_{n-1}' {\mathop {\rightarrow }\limits ^{\alpha _{n-1}}} q_n {\mathop {\rightarrow }\limits ^{\alpha _{n+1}}} q_{n+1} {\mathop {\rightarrow }\limits ^{\alpha _{n+2}}} q_{n+2}\rightarrow \cdots \end{aligned}$$
generates the word
$$\begin{aligned} \sigma _0 \sigma _0 \sigma _1 \sigma _2 \cdots \sigma _{n-2} \sigma _n\sigma _{n+1}\sigma _{n+2}\cdots . \end{aligned}$$
Now the projection of \(\theta _i\) onto \(\mathcal {B}\) has the form
$$ s_0 \xrightarrow {\sigma _0} s_1 \xrightarrow {\sigma _1} s_2 \xrightarrow {\sigma _2} \cdots \xrightarrow {\sigma _{n-2}} s_{n-1} \xrightarrow {\sigma _n} s_n \xrightarrow {\sigma _n} s_{n+1} \xrightarrow {\sigma _{n+1}} s_{n+2} \xrightarrow {\sigma _{n+2}} \cdots $$
since \(\sigma _{n-1}=\sigma _n\). By Lemma 1, there is a path in \(\mathcal {B}\)
$$\begin{aligned} s_0 \xrightarrow {\sigma _0} s_1 \xrightarrow {\sigma _0} s_1' \xrightarrow {\sigma _1} s_2 \xrightarrow {\sigma _2} \cdots \xrightarrow {\sigma _{n-2}} s_{n-1} \xrightarrow {\sigma _n} s_n \xrightarrow {\sigma _{n+1}} s_{n+2} \xrightarrow {\sigma _{n+2}} \cdots \end{aligned}$$
which accepts the word (5). Composing (4) and (6) therefore gives a path through the product space. Removing the first transition (labeled \(\langle \alpha _n,\sigma _0\rangle \)) from this path yields \(\theta _{i+1}\). Appending that transition to \(\eta _i\) yields \(\eta _{i+1}\).
Case 2b: No operation in A occurs in \(\theta _i\). By C0, A is nonempty. Let \(\beta \in A\). By C2, every operation in \(\theta _i\) is independent of \(\beta \). With an argument that is similar to the one for Case 2a, we can see there is a path in the product space for which the projection onto the program component has the form
$$ q_0 {\mathop {\rightarrow }\limits ^{\beta }} q_1' {\mathop {\rightarrow }\limits ^{\alpha _1}} q_2' {\mathop {\rightarrow }\limits ^{\alpha _2}} q_3' \rightarrow \cdots $$
and the projection onto the property component has the form
$$ s_0 \xrightarrow {\sigma _0} s_1 \xrightarrow {\sigma _0} s_1' \xrightarrow {\sigma _1} s_2 \xrightarrow {\sigma _2} \cdots . $$
Removing the first transition from this path yields \(\theta _{i+1}\). Appending that transition to \(\eta _i\) yields \(\eta _{i+1}\). This completes the definitions of \(\eta _{i+1}\) and \(\theta _{i+1}\).

Let \(\eta \) be the limit of the \(\eta _i\). Clearly \(\eta \) is an infinite path through the reduced product space, starting from the initial state. We must show that it passes through an accepting state infinitely often. To do so, we must examine more closely the sequence of property states through which each \(\theta _i\) passes.

Let \(i\ge 0\), and \(s_0\) the final state of \(\eta _i\). Say \(\theta _i\) passes through states \(s_0s_1s_2\cdots \). Then the final state of \(\eta _{i+1}\) will be \(s_1\), and the state sequence of \(\theta _{i+1}\) is determined by the three cases as follows:
$$\begin{aligned} \begin{array}{rl} \text {Case 1:}\ &{} s_1 s_2 \cdots \\ \text {Case 2a:}\ &{} s_1 s_1' s_2 \cdots s_ns_{n+2}\cdots (s_{n+1}\in F \implies s_n\in F)\\ \text {Case 2b:}\ &{} s_1 s_1' s_2 \cdots \end{array} \end{aligned}$$
We first claim that for all \(i\ge 0\), \(\theta _i\) passes through an accepting state infinitely often. This holds for \(\theta _0\), which is an accepting path by assumption. Assume it holds for \(\theta _i\). In each case of (7), we see that the state sequence of \(\theta _{i+1}\) has a suffix which is a suffix of the state sequence of \(\theta _i\), so the claim holds for \(\theta _{i+1}\).

Definition 13

For any path \(\xi =s_0\rightarrow s_1\rightarrow \cdots \) through \(\mathcal {B}\) which passes through an accepting state infinitely often, define the accepting distance of \(\xi \), written \(\mathsf {AD}(\xi )\), to be the minimum \(k\ge 1\) for which \(s_k\) is accepting.

Lemma 2

Let \(i\ge 0\) and say the state sequence of \(\theta _i\) is \(s_0s_1s_2\cdots \). If \(s_1\) is not accepting then one of the following holds:
  • Case 1 holds and \(\mathsf {AD}(\theta _{i+1})<\mathsf {AD}(\theta _i)\), or

  • Case 2a or 2b holds and \(\mathsf {AD}(\theta _{i+1})\le \mathsf {AD}(\theta _i)\).


If \(s_1\) is not accepting then there is some \(k\ge 2\) for which \(s_k\) is accepting. The result follows by examining (7). In Case 1, the accepting distance decreases by 1. In Case 2a, the accepting distance is either unchanged (if \(k\le n\)) or decreases by 1 (if \(k>n\)). In Case 2b, the accepting distance is unchanged.    \(\square \)

Lemma 3

For an infinite number of \(i\ge 0\), Case 1 holds for \(\theta _i\).


Suppose not. Then there is some \(i\ge 0\) such that Case 2 holds for all \(j\ge i\). Let \(\alpha _1\) be the first program operation of \(\theta _i\). Then \(\alpha _1\) is the first program operation of \(\theta _j\), for all \(j\ge i\). Furthermore, for all \(j\ge i\), \(\alpha _1\) is not in the ample set of the final state of \(\eta _j\). Since the product space has only a finite number of states, this means there is a cycle in the reduced space for which \(\alpha _1\) is enabled but never in the ample set, contradicting \(\mathbf C3 '_1\).    \(\square \)

We now show that \(\eta \) passes through an accepting state infinitely often. Note that, if \(\mathsf {AD}(\theta _i)=1\), an accepting state is added to \(\eta _i\) to form \(\eta _{i+1}\). Suppose \(\eta \) does not pass through an accepting state infinitely often. Then there is some \(i\ge 0\) such that for all \(j\ge i\), \(\mathsf {AD}(\theta _j)>1\). By Lemma 2, \((\mathsf {AD}(\theta _j))_{j\ge i}\) is a nonincreasing sequence of positive integers, and by Lemma 3, this sequence strictly decreases infinitely often, a contradiction. This completes the proof of Theorem 1.

Remark 1

The proof goes through with minor modifications for V0 in place of V1. Let \(A=\mathsf {amp}(q_0,s_1)\) instead of \(\mathsf {amp}(q_0,s_0)\). In Case 2a (similarly in 2b), note the first transition \(s_0 \xrightarrow {\sigma _0}s_1\) in the path in \(\mathcal {B}\) remains in the new path (6).

8 Summary of Experimental Results and Conclusion

We have seen that standard ways of combining POR and on-the-fly model checking are unsound. This is not only a theoretical issue—the defect in the algorithm is realized in Spin, which can produce an incorrect result. A modification (V1) seems to help, but is still not enough to guarantee soundness for any Büchi automaton with a stutter-invariant language. However, any such automaton can be transformed into a normal form for which algorithm V1 is sound.
Fig. 5.

Bounded verification of soundness of POR schemes V0 and V1 on various Büchi automata using Alloy. \(\mathcal {B}_5\) represents all automata in SI normal form within the bounds. Each run resulted in either a counterexample ( Open image in new window ) or not ( Open image in new window ).

Alloy proved useful for reasoning about the algorithms and generating small counterexamples. A summary of the Alloy experiments and results is given in Fig. 5. These were run on an 8-core 3.7GHz Intel Xeon W-2145 and used the plingeling SAT solver [1].3 In addition to the experiments already discussed, Alloy found no soundness counterexamples for property automata \(\mathcal {B}_3\) or \(\mathcal {B}_4\), using V0 or V1. In the case of \(\mathcal {B}_4\), this is what Theorem 1 predicts. For further confirmation of Theorem 1, I constructed a general Alloy model of Büchi automata in SI normal form, represented by \(\mathcal {B}_5\) in the table. Alloy confirms that both V0 and V1 are sound for all such automata within small bounds on program and automata size.

It is possible that the use of the normal form, while correct, cancels out the benefits of POR. A comprehensive exploration of this issue is beyond the scope of this paper, but I can provide data on one non-trivial example. I encoded an n-process version of Peterson’s mutual exclusion algorithm in Promela, and used Spin to verify starvation-freedom for one process in the case \(n=5\). If p is the predicate that holds whenever the process is enabled, a trace violates this property if p holds only a finite number of times in the trace, i.e., if the trace is in \(\mathcal {L}(\mathcal {B}_3)=\mathcal {L}(\mathcal {B}_4)\). Figure 6 shows the results of Spin verification using \(\mathcal {B}_3\) without POR, and using \(\mathcal {B}_3\) and \(\mathcal {B}_4\) with POR. The results indicate that POR significantly improves performance on this problem, and that using the normal form \(\mathcal {B}_4\) in place of \(\mathcal {B}_3\) actually improves performance further by a small amount.
Fig. 6.

Spin verification of starvation-freedom for 5-process Peterson. Using the SI normal form \(\mathcal {B}_4\) instead of the smaller \(\mathcal {B}_3\) has little impact on performance.

It is likely that V1 is sound for other interesting classes of automata. Observe, for example, that \(\mathcal {B}_2\) of Fig. 3 has states u where the language of the automaton with u considered as the initial state is not stutter-invariant. If we restrict to automata in which every state has a stutter-invariant language, is V1 sound? I have neither a proof nor a counterexample. (This is certainly not true of V0, as \(\mathcal {B}_1\) is a counterexample.) To explore this question, it would help to find a way to encode the stutter-invariant property—or a suitable approximation—in Alloy.

Finally, the proof of Theorem 1 is complicated and might also be flawed. Recent work mechanizing such proofs [3] represents an important advance in raising the level of assurance in model checking algorithms. It would be interesting to see if the proof of this theorem is amenable to such methods. However, constructing such proofs requires far more effort than the Alloy approach described here. One possible approach moving forward is to use tools such as Alloy when prototyping a new algorithm, to get feedback quickly and root out bugs. Once Alloy no longer finds any counterexamples, one could then expend the considerable effort required to construct a formal mechanized proof.


  1. 1.

    Previous work, for example, has dealt with problems, distinct from those discussed in this paper, that arise when combining nested depth first search and POR [7, 14].

  2. 2.

    I am using the numbering from [4]. In [13], C2 and C3 are swapped.

  3. 3.

    All artifacts needed to reproduce the experiments reported in this paper can be downloaded from



I am grateful to Ganesh Gopalakrishnan and Julian Brunner for fruitful conversations on partial order reduction, to Gerard Holzmann for help with Spin, and to the anonymous reviewers for suggestions that improved this paper. This material is based upon work by the RAPIDS Institute, supported by the U.S. Department of Energy, Office of Science, Office of Advanced Scientific Computing Research, Scientific Discovery through Advanced Computing (SciDAC) program. Funding was also provided by the U.S. National Science Foundation under award CCF-1319571.


  1. 1.
    Biere, A.: CaDiCaL, Lingeling, Plingeling, Treengeling, YalSAT Entering the SAT Competition 2017. In: Balyo, T., Heule, M., Järvisalo, M. (eds.) Proceedings of SAT Competition 2017 - Solver and Benchmark Descriptions. Department of Computer Science Series of Publications B, vol. B-2017-1, pp. 14–15. University of Helsinki (2017)Google Scholar
  2. 2.
    Brunner, J.: Implementation and verification of partial order reduction for on-the-fly model checking. Master’s thesis, Technische Universität München, Department of Computer Science, July 2014.
  3. 3.
    Brunner, J., Lammich, P.: Formal verification of an executable LTL model checker with partial order reduction. J. Autom. Reason. 60, 3–21 (2018). Scholar
  4. 4.
    Clarke Jr., E.M., Grumberg, O., Peled, D.A.: Model Checking. MIT Press, Cambridge (1999)Google Scholar
  5. 5.
    Courcoubetis, C., Vardi, M., Wolper, P., Yannakakis, M.: Memory-efficient algorithms for the verification of temporal properties. Form. Methods Syst. Des. 1(2), 275–288 (1992). Scholar
  6. 6.
    Etessami, K.: Stutter-invariant languages, \(\omega \)-automata, and temporal logic. In: Halbwachs, N., Peled, D. (eds.) CAV 1999. LNCS, vol. 1633, pp. 236–248. Springer, Heidelberg (1999). Scholar
  7. 7.
    Holzmann, G., Peled, D., Yannakakis, M.: On nested depth first search. In: The Spin Verification System, DIMACS - Series in Discrete Mathematics and Theoretical Computer Science, vol. 32, pp. 23–31. AMS and DIMACS (1997).
  8. 8.
    Holzmann, G.J.: The Spin Model Checker: Primer and Reference Manual. Addison-Wesley, Boston (2004)Google Scholar
  9. 9.
    Holzmann, G.J., Kupferman, O.: Not checking for closure under stuttering. In: Grégoire, J.C., Holzmann, G.J., Peled, D.A. (eds.) The SPIN Verification System. DIMACS Series in Discrete Mathematics and Theoretical Computer Science, vol. 32, pp. 17–22. American Mathematical Society (1997)Google Scholar
  10. 10.
    Holzmann, G.J., Peled, D.: An improvement in formal verification. In: Hogrefe, D., Leue, S. (eds.) Proceedings of the 7th IFIP WG6.1 International Conference on Formal Description Techniques (Forte 1994). IFIP Conference Proceedings, vol. 6, pp. 197–211. Chapman & Hall (1995).
  11. 11.
    Jackson, D.: Software Abstractions: Logic, Language, and Analysis, Revised edn. MIT Press (2012)Google Scholar
  12. 12.
    Peled, D.: Combining partial order reductions with on-the-fly model-checking. In: Dill, D.L. (ed.) CAV 1994. LNCS, vol. 818, pp. 377–390. Springer, Heidelberg (1994). Scholar
  13. 13.
    Peled, D.: Combining partial order reductions with on-the-fly model-checking. Form. Methods Syst. Des. 8(1), 39–64 (1996). Scholar
  14. 14.
    Schwoon, S., Esparza, J.: A note on on-the-fly verification algorithms. In: Halbwachs, N., Zuck, L.D. (eds.) TACAS 2005. LNCS, vol. 3440, pp. 174–190. Springer, Heidelberg (2005). Scholar
  15. 15.
    Siegel, S.F.: Reexamining two results in partial order reduction. Technical report. UD-CIS-2011/06, U. Delaware (2011).
  16. 16.
    Siegel, S.F.: Transparent partial order reduction. Form. Methods Syst. Des. 40(1), 1–19 (2012). Scholar
  17. 17.
    Siegel, S.F.: What’s wrong with on-the-fly partial order reduction (extended version). Technical report. UD-CIS-2019/05, University of Delaware (2019).

Copyright information

© The Author(s) 2019

Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

Authors and Affiliations

  1. 1.University of DelawareNewarkUSA

Personalised recommendations