Automated Hypersafety Verification
Abstract
We propose an automated verification technique for hypersafety properties, which express sets of valid interrelations between multiple finite runs of a program. The key observation is that constructing a proof for a small representative set of the runs of the product program (i.e. the parallel composition of several copies of the program with itself), called a reduction, is sufficient to formally prove the hypersafety property about the program. We propose an algorithm based on a counterexample-guided refinement loop that simultaneously searches for a reduction and a proof of the correctness of the reduction. We demonstrate that our tool Weaver is very effective in verifying a diverse array of hypersafety properties for a diverse class of input programs.
1 Introduction
A hypersafety property describes the set of valid interrelations between multiple finite runs of a program. A k-safety property [7] is a program safety property whose violation is witnessed by at least k finite runs of a program. Determinism is an example of such a property: nondeterminism can only be witnessed by two runs of the program on the same input which produce two different outputs. This makes determinism an instance of a 2-safety property.
The vast majority of existing program verification methodologies are geared towards verifying standard (1-)safety properties. This paper proposes an approach to automatically reduce verification of k-safety to verification of 1-safety, and hence a way to leverage existing safety verification techniques for hypersafety verification. The most straightforward way to do this is via self-composition [5], where verification is performed on k memory-disjoint copies of the program, sequentially composed one after another. Unfortunately, the proofs in these cases are often very verbose, since the full functionality of each copy has to be captured by the proof. Moreover, when it comes to automated verification, the invariants required to verify such programs are often well beyond the capabilities of modern solvers [26], even for very simple programs and properties.
The more practical approach, which is typically used in manual or automated proofs of such properties, is to compose k memory-disjoint copies of the program in parallel (instead of in sequence), and then verify some reduced program obtained by removing redundant traces from the program formed in the previous step. This parallel product program can have many such reductions. For example, the program formed from sequential self-composition is one such reduction of the parallel product program. Therefore, care must be taken to choose a “good” reduction that admits a simple proof. Many existing approaches limit themselves to a narrow class of reductions, such as the one where each copy of the program executes in lockstep [3, 10, 24], or define a general class of reductions, but do not provide algorithms with guarantees of covering the entire class [4, 24].
We propose a solution that combines the search for a safety proof with the search for an appropriate reduction, in a counterexample-based refinement loop. Instead of settling on a single reduction in advance, we try to verify the entire (possibly infinite) set of reductions simultaneously and terminate as soon as some reduction is successfully verified. If the proof is not currently strong enough to cover at least one of the represented program reductions, then an appropriate set of counterexamples is generated that guarantees progress towards a proof.
Our solution is language-theoretic. We propose a way to represent sets of reductions using infinite tree automata. The standard safety proofs are also represented using the same automata, which have the desired closure properties. This allows us to check if a candidate proof is in fact a proof for one of the represented program reductions, with reasonable efficiency.

Our contributions are summarized as follows:
- We present a counterexample-guided refinement loop that simultaneously searches for a proof and a program reduction (Sect. 7). This refinement loop relies on an efficient algorithm for proof checking based on the antichain method of [8], and comes with strong theoretical progress guarantees.
- We propose an automata-based approach to representing a class of program reductions for k-safety verification. In Sect. 5 we describe the precise class of automata we use and show how their use leads to an effective proof checking algorithm incorporated in our refinement loop.
- We demonstrate the efficacy of our approach in proving hypersafety properties of sequential and concurrent benchmarks (Sect. 8).
2 Illustrative Example
Figure 1(ii) illustrates the parallel composition of Mult with two copies of itself. The product program is formed for the purpose of proving distributivity, which can be encoded through the postcondition \(x_1 = x_2 + x_3\). Since a, b, and c are not modified in the program, the same variables are used across all copies. One way to prove Mult is distributive is to come up with an inductive invariant \(\phi _{ijk}\) for each location in the product program, represented by a triple of program locations \((\ell _i, \ell _j, \ell _k)\), such that \( true \implies \phi _{111}\) and \(\phi _{666} \implies x_1 = x_2 + x_3\). The main difficulty lies in finding assignments such as \(\phi _{611}\), for locations at which one thread has finished executing and the next one is starting. For example, at \((\ell _6, \ell _1, \ell _1)\) we need the assignment \(\phi _{611} \leftarrow x_1 = (a + b) * c\), which is nonlinear. However, the program given in Fig. 1(ii) can be verified with simpler (linear) reasoning.
The program on the right is a semantically equivalent reduction of the full composition of Fig. 1(ii). Consider the program P = (Copy 1 || (Copy 2; Copy 3)). The program on the right is equivalent to a lockstep execution of the two parallel components of P. The validity of this reduction is derived from the fact that the statements in each thread are independent of the statements in the other. That is, reordering the statements of different threads in an execution leads to an equivalent execution. It is easy to see that \(x_1 = x_2 + x_3\) is an invariant of both while loops in the reduced program, and therefore, linear reasoning is sufficient to prove the postcondition for this program. Conceptually, this reduction (and its soundness proof), together with the proof of correctness for the reduced program, constitute a proof that the original program Mult is distributive. Our proposed approach can come up with reductions like this and their corresponding proofs fully automatically. Note that a lockstep reduction of the program in Fig. 1(ii) would not yield a solution for this problem, and therefore the discovery of the right reduction is an integral part of the solution.
3 Programs and Proofs
A nondeterministic finite automaton (NFA) is a tuple \(A = (Q, \varSigma , \delta , q_0, F)\) where Q is a finite set of states, \(\varSigma \) is a finite alphabet, \(\delta \subseteq Q \times \varSigma \times Q\) is the transition relation, \(q_0 \in Q\) is the initial state, and \(F \subseteq Q\) is the set of final states. A deterministic finite automaton (DFA) is an NFA whose transition relation is a function \(\delta : Q \times \varSigma \rightarrow Q\). The language of an NFA or DFA A is denoted \(\mathcal {L}_{}(A)\), which is defined in the standard way [18].
3.1 Program Traces
\(\mathcal {S}t\) denotes the (possibly infinite) set of program states. For example, a program with two integer variables has \(\mathcal {S}t= \mathbb {Z} \times \mathbb {Z}\). \(\mathcal {A}\subseteq \mathcal {S}t\) is a (possibly infinite) set of assertions on program states. \(\varSigma \) denotes a finite alphabet of program statements. We refer to a finite string of statements as a (program) trace. With each statement \(a \in \varSigma \) we associate a semantics \(\llbracket a\rrbracket \subseteq \mathcal {S}t\times \mathcal {S}t\) and extend \(\llbracket \rrbracket \) to traces via (relation) composition. A trace \(x \in \varSigma ^*\) is said to be infeasible if \(\llbracket x\rrbracket (\mathcal {S}t) = \emptyset \), where \(\llbracket x\rrbracket (\mathcal {S}t)\) denotes the image of \(\mathcal {S}t\) under \(\llbracket x\rrbracket \). To abstract away from a particular program syntax, we define a program as a regular language of traces. The semantics of a program P is simply the union of the semantics of its traces: \(\llbracket P\rrbracket = \bigcup _{x \in P} \llbracket x\rrbracket \). Concretely, one may obtain programs as languages by interpreting their edge-labelled control-flow graphs as DFAs: each vertex in the control-flow graph is a state, and each edge is a transition. The control-flow graph entry location is the initial state of the DFA and all its exit locations are final states.
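To make the relational semantics concrete, the following is a minimal sketch (not from the paper) over a toy two-variable state space. The statement names and the bounded state space are illustrative; the sketch shows how \(\llbracket \cdot \rrbracket \) extends from statements to traces via relation composition, and how infeasibility is checked as emptiness of the image.

```python
# Sketch: trace semantics as relations over a small, illustrative state space.
from itertools import product

STATES = list(product(range(3), range(3)))  # St = {0,1,2} x {0,1,2}, states are (x, y)

def sem(stmt):
    """Semantics [[a]] of a single statement, as a set of (pre, post) state pairs."""
    if stmt == "x:=0":
        return {((x, y), (0, y)) for (x, y) in STATES}
    if stmt == "inc_x":
        return {((x, y), (x + 1, y)) for (x, y) in STATES if x + 1 < 3}
    if stmt == "[x>y]":  # assume statement: filters states, changes nothing
        return {((x, y), (x, y)) for (x, y) in STATES if x > y}
    raise ValueError(stmt)

def compose(r, s):
    """Relation composition, used to extend [[.]] from statements to traces."""
    return {(p, q) for (p, m1) in r for (m2, q) in s if m1 == m2}

def trace_sem(trace):
    rel = {(p, p) for p in STATES}  # identity = semantics of the empty trace
    for stmt in trace:
        rel = compose(rel, sem(stmt))
    return rel

def infeasible(trace):
    # [[x]](St) is the image of St under [[x]]; the trace is infeasible iff it is empty
    return len(trace_sem(trace)) == 0

print(infeasible(["x:=0", "[x>y]", "inc_x"]))  # True: x = 0 can never exceed y >= 0
print(infeasible(["inc_x", "[x>y]"]))          # False: e.g. (0,0) -> (1,0) passes [x>y]
```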
3.2 Safety
There are many equivalent notions of program safety; we use non-reachability. A program P is safe if all traces of P are infeasible, i.e. \(\llbracket P\rrbracket (\mathcal {S}t) = \emptyset \). Standard partial correctness specifications are then represented via a simple encoding. Given a precondition \(\phi \) and a postcondition \(\psi \), the validity of the Hoare triple \(\{\phi \}P\{\psi \}\) is equivalent to the safety of \([\phi ] \cdot P \cdot [\lnot \psi ]\), where \([\cdot ]\) is a standard assume statement (or the singleton set containing it), and \(\cdot \) is language concatenation.
Example 3.1
We use determinism as an example of how k-safety can be encoded in the framework defined thus far. If P is a program, then determinism of P is equivalent to safety of \([\phi ] \cdot (P_1 \shuffle P_2) \cdot [\lnot \phi ]\), where \(P_1\) and \(P_2\) are copies of P operating on disjoint variables, \(\shuffle \) is the shuffle product of two languages, and \([\phi ]\) is an assume statement asserting that the variables in each copy of P are equal.
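The shuffle product used in this encoding can be sketched directly. The example program and variable names below are illustrative; words are tuples of statements, and the shuffle of two words is the set of all their interleavings.

```python
# Sketch: shuffle product of two finite languages over disjoint variable copies.

def shuffle(u, v):
    """All interleavings of words u and v, preserving the order within each word."""
    if not u:
        return {v}
    if not v:
        return {u}
    return {u[:1] + w for w in shuffle(u[1:], v)} | \
           {v[:1] + w for w in shuffle(u, v[1:])}

def shuffle_lang(L1, L2):
    return {w for u in L1 for v in L2 for w in shuffle(u, v)}

P1 = {("x1:=a", "y1:=x1+1")}   # copy 1 of a tiny program
P2 = {("x2:=a", "y2:=x2+1")}   # copy 2 on disjoint variables
print(len(shuffle_lang(P1, P2)))   # 6: the interleavings of two 2-statement traces
```

The count 6 is the binomial coefficient C(4, 2): positions for copy 1's statements among the four slots.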
4 Reductions
The set of assertions used for a proof is usually determined by a particular language of assertions, and a safe program may not have a (safety) proof in that particular language. Yet, a subset of the program traces may have a proof in that assertion language. If it can be proven that the subset of program runs that have a safety proof are a faithful representation of all program behaviours (with respect to a given property), then the program is correct. This motivates the notion of program reductions.
Definition 4.1
(semantic reduction). For programs P and \(P'\), if the safety of \(P'\) implies the safety of P, then \(P'\) is a semantic reduction of P (written \(P' \preceq P\)).
Proposition 4.2
The proof rule SafeRed2 is sound.
The core contribution of this paper is an algorithmic solution inspired by the above proof rule. To achieve this, two subproblems are solved: (1) Given a set \(\mathcal {R}\) of reductions of a program P and a candidate proof \(\varPi \), can we check whether there exists a reduction \(P' \in \mathcal {R}\) that is covered by the proof \(\varPi \)? In Sect. 5, we propose a new semantic interpretation of an existing notion of infinite tree automata that gives rise to an algorithmic check for this step. (2) Given a program P, is there a general sound set of reductions \(\mathcal {R}\) that can be effectively represented to accommodate step (1)? In Sect. 6, we propose a construction of an effective set of reductions, representable by our infinite tree automata, drawing inspiration from existing partial order reduction techniques [15].
5 Proof Checking
The set \(\varSigma ^*\) can be represented as an infinite tree. Each \(x \in \varSigma ^*\) defines a path to a unique node in the tree: the root node is located at the empty string \(\epsilon \), and for all \(a \in \varSigma \), the node located at xa is a child of the node located at x. Each node is then identified by the string labeling the path leading to it. A language \(L \subseteq \varSigma ^*\) (equivalently, \(L : \varSigma ^* \rightarrow \mathbb {B}\)) can consequently be represented as an infinite tree where the node at each x is labelled with a boolean value \(B \equiv (x \in L)\). An example is given in Fig. 2.
It follows that a set of languages is a set of infinite trees, which can be represented using automata over infinite trees. Looping Tree Automata (LTAs) are a subclass of Büchi Tree Automata where all states are accept states [2]. The class of Looping Tree Automata is closed under intersection and union, and checking emptiness of LTAs is decidable. Unlike Büchi Tree Automata, emptiness can be decided in linear time [2].
Definition 5.1
A Looping Tree Automaton (LTA) over \(\varSigma \)ary, \(\mathbb {B}\)labelled trees is a tuple \(M = (Q, \varDelta , q_0)\) where Q is a finite set of states, \(\varDelta \subseteq Q \times \mathbb {B}\times (\varSigma \rightarrow Q)\) is the transition relation, and \(q_0\) is the initial state.
Intuitively, an LTA \(M = (Q, \varDelta , q_0)\) performs a parallel, depth-first traversal of an infinite tree L while maintaining some local state. Execution begins at the root \(\epsilon \) from state \(q_0\) and nondeterministically picks a transition \((q_0, B, \sigma ) \in \varDelta \) such that B matches the label at the root of the tree (i.e. \(B = (\epsilon \in L)\)). If no such transition exists, the tree is rejected. Otherwise, M recursively works on each child a from state \(q' = \sigma (a)\) in parallel. This process continues infinitely, and L is accepted if and only if it is never rejected.
Formally, M’s execution over a tree L is characterized by a run \(\delta ^* : \varSigma ^* \rightarrow Q\) where \(\delta ^*(\epsilon ) = q_0\) and \((\delta ^*(x),\ x \in L,\ \lambda a.\ \delta ^*(xa)) \in \varDelta \) for all \(x \in \varSigma ^*\). The set of languages accepted by M is then defined as \(\mathcal {L}_{}(M) = \{ L \mid \text {there exists a run of } M \text { over } L \}\).
Theorem 5.2
Given an LTA M and a regular language L, it is decidable whether there exists some \(P \in \mathcal {L}_{}(M)\) such that \(P \subseteq L\).
The proof, which appears in [14], reduces the problem to deciding whether \(\mathcal {L}_{}(M) \cap \mathcal {P}(L) \ne \emptyset \), where \(\mathcal {P}(L)\) denotes the set of all subsets (sublanguages) of L. LTAs are closed under intersection and have decidable emptiness checks, and the lemma below is the last piece of the puzzle.
Lemma 5.3
If L is a regular language, then \({\mathcal {P}(L)}\) is recognized by an LTA.
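One way such an LTA can be built (the paper's actual construction appears in [14]) is to track the state of a DFA for L along every tree path and allow a node to be labelled as a member only at accepting DFA states; then the trees that are never rejected are exactly the subsets of L. The sketch below simulates this automaton on the characteristic tree of a finite language up to a bounded depth; the DFA for \(a^*b\) is an illustrative choice.

```python
# Sketch: an LTA recognizing P(L), simulated on the characteristic tree of a
# finite language Lp.  The LTA state at node x is the DFA state reached on x.

SIGMA = "ab"

# Illustrative DFA for L = a*b: states q0 (start), qf (accept), qd (dead)
DELTA = {("q0", "a"): "q0", ("q0", "b"): "qf",
         ("qf", "a"): "qd", ("qf", "b"): "qd",
         ("qd", "a"): "qd", ("qd", "b"): "qd"}
FINALS = {"qf"}

def lta_rejects(Lp, depth):
    """Visit all tree nodes x with |x| <= depth.  The tree is rejected if a node
    labelled 'in Lp' sits at a non-accepting DFA state, witnessing Lp ⊄ L."""
    frontier = [("", "q0")]
    for _ in range(depth + 1):
        nxt = []
        for x, q in frontier:
            if x in Lp and q not in FINALS:
                return True  # no transition matches this label: rejected
            nxt += [(x + a, DELTA[(q, a)]) for a in SIGMA]
        frontier = nxt
    return False

print(lta_rejects({"b", "aab"}, 4))   # False: {b, aab} is a subset of a*b
print(lta_rejects({"b", "ba"}, 4))    # True: "ba" is not in a*b
```

For a finite Lp, exploring up to the length of its longest word suffices: deeper nodes are labelled false and can never cause rejection.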
Counterexamples. Theorem 5.2 effectively states that proof checking is decidable. For automated verification, beyond checking the validity of a proof, we require counterexamples to fuel the development of the proof when the proof does not check. Note that in the simple case of the proof rule safe, when \(P \not \subseteq \mathcal {L}_{}(\varPi )\) there exists a counterexample trace \(x \in P\) such that \(x \notin \mathcal {L}_{}(\varPi )\).
With our proof rule SafeRed2, things get a bit more complicated. First, note that unlike the classic case (safe), where a failed proof check coincides with the non-emptiness of an intersection check (i.e. \(P \cap \overline{\mathcal {L}_{}(\varPi )} \not = \emptyset \)), in our case a failed proof check coincides with the emptiness of an intersection check (i.e. \(\mathcal {R} \cap {\mathcal {P}(\mathcal {L}_{}(\varPi ))} = \emptyset \)). The sets \(\mathcal {R}\) and \({\mathcal {P}(\mathcal {L}_{}(\varPi ))}\) are both sets of languages. What does a witness to the emptiness of the intersection look like? Each language member of \(\mathcal {R}\) contains at least one string that does not belong to any of the subsets of our proof language. One can collect all such witness strings to guarantee progress across the board in the next round. However, since LTAs can represent an infinite set of languages, one must take care not to end up with an infinite set of counterexamples following this strategy. Fortunately, this will not be the case.
Theorem 5.4
Let M be an LTA and let L be a regular language such that \(P \not \subseteq L\) for all \(P \in \mathcal {L}_{}(M)\). There exists a finite set of counterexamples C such that, for all \(P \in \mathcal {L}_{}(M)\), there exists some \(x \in C\) such that \(x \in P\) and \(x \notin L\).
The proof appears in [14]. This theorem justifies our choice of using LTAs instead of more expressive formalisms such as Büchi Tree Automata. For example, the Büchi Tree Automaton that accepts the language \(\{ \{x\} \mid x \in \varSigma ^* \}\) would give rise to an infinite number of counterexamples with respect to the empty proof (i.e. \(\varPi = \emptyset \)). The finiteness of the counterexample set presents an alternate proof that LTAs are strictly less expressive than Büchi Tree Automata [27].
6 Sleep Set Reductions
We have established so far that (1) a set of assertions gives rise to a regular language proof, and (2) given a regular language proof and a set of program reductions recognizable by an LTA, we can check the program (reductions) against the proof. The last piece of the puzzle is to show that a useful class of program reductions can be expressed using LTAs.
Recall our example from Sect. 2. The reduction we obtain is sound because, for every trace in the full parallel-composition program, an equivalent trace exists in the reduced program. By equivalent, we mean that one trace can be obtained from the other by swapping independent statements. Such an equivalence is the essence of the theory of Mazurkiewicz traces [9].
We fix a reflexive symmetric dependence relation \(D \subseteq \varSigma \times \varSigma \). For all \(a, b \in \varSigma \), we say that a and b are dependent if \((a, b) \in D\), and say they are independent otherwise. We define \(\sim _D\) as the smallest congruence satisfying \(xaby \sim _D xbay\) for all \(x, y \in \varSigma ^*\) and independent \(a, b \in \varSigma \). The closure of a language \(L \subseteq \varSigma ^*\) with respect to \(\sim _D\) is denoted \([L]_D\). A language L is \(\sim _D\)-closed if \(L = [L]_D\). It is worthwhile to note that all input programs considered in this paper correspond to regular languages that are \(\sim _D\)-closed.
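As a small illustration, the closure \([L]_D\) of a finite language can be computed by repeatedly swapping adjacent independent letters until no new words appear; the alphabet and dependence relation below are illustrative.

```python
# Sketch: [L]_D for a finite language, via BFS over adjacent-swap rewrites.
from collections import deque

def closure(words, dependent):
    """dependent: set of pairs in D (D is symmetric and reflexive by convention)."""
    indep = lambda a, b: (a, b) not in dependent and (b, a) not in dependent
    seen, todo = set(words), deque(words)
    while todo:
        w = todo.popleft()
        for i in range(len(w) - 1):
            a, b = w[i], w[i + 1]
            if indep(a, b):  # xaby ~ xbay for independent a, b
                swapped = w[:i] + b + a + w[i + 2:]
                if swapped not in seen:
                    seen.add(swapped)
                    todo.append(swapped)
    return seen

# a and b independent of each other; c depends on everything
D = {("a", "c"), ("b", "c"), ("a", "a"), ("b", "b"), ("c", "c")}
print(sorted(closure({"abc"}, D)))  # ['abc', 'bac']: c cannot move past a or b
```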
An equivalence class of \(\sim _D\) is typically called a (Mazurkiewicz) trace. We avoid using this terminology as it conflicts with our definition of traces as strings of statements in Sect. 3.1. We assume D is sound, i.e. \(\llbracket ab\rrbracket = \llbracket ba\rrbracket \) for all independent \(a, b \in \varSigma \).
Definition 6.1
(D-reduction). A program \(P'\) is a D-reduction of a program P, written \(P' \preceq _D P\), if \([P']_D = P\).
Note that the equivalence relation on programs induced by \(\sim _D\) is a refinement of the semantic equivalence relation used in Definition 4.1.
Lemma 6.2
If \(P' \preceq _D P\) then \(P' \preceq P\).
Ideally, we would like to define an LTA that accepts all D-reductions of a program P, but unfortunately this is not possible in general.
Proposition 6.3
(corollary of Theorem 67 of [9]). For arbitrary regular languages \(L_1, L_2 \subseteq \varSigma ^*\) and relation D, the proposition \(L_1 \preceq _D L_2\) is undecidable.
The proposition is decidable only when \(\overline{D}\) is transitive, which does not hold for a semantically correct notion of independence for a parallel program encoding a ksafety property, since statements from the same thread are dependent and statements from different program copies are independent. Therefore, we have:
Proposition 6.4
Assume P is a \(\sim _D\)-closed program and \(\varPi \) is a proof. The proposition \(\exists P' .\ P' \preceq _D P \wedge P' \subseteq \mathcal {L}_{}(\varPi )\) is undecidable.
Consider two statements \(a,b \in \varSigma \) where \((a,b) \not \in D\). Let \(x,y \in \varSigma ^*\) and consider two program runs xaby and xbay. We know \(\llbracket xbay\rrbracket = \llbracket xaby\rrbracket \). If the automaton makes a nondeterministic choice that the successors of xa have been explored, then the successors of xba need not be explored (they can be pruned away), as illustrated in Fig. 3. Now assume \((a,c) \in D\) for some \(c \in \varSigma \). When the node xbc is being explored, we can no longer safely ignore a-transitions, since the equality \(\llbracket xbcay\rrbracket = \llbracket xabcy\rrbracket \) is not guaranteed. Therefore, the a-successor of xbc has to be explored. The nondeterministic choice of which child node to explore is modelled by a choice of order in which we explore each node’s children. Different orders yield different reductions. Reductions are therefore characterized as an assignment \(R : \varSigma ^* \rightarrow \mathcal {L}in(\varSigma )\) from nodes to linear orderings on \(\varSigma \), where \((a, b) \in R(x)\) means we explore child xa after child xb.
Finally, we obtain an actual reduction of a program P from a characterization of a reduction R by removing the ignored nodes from P, i.e. \(P \setminus \mathrm {ignore}_R\).
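The pruning described above is, in essence, the classic sleep set technique of [15]. The following minimal sketch explores all words of a fixed length over \(\varSigma \) (standing in for a \(\sim _D\)-closed program), with a fixed alphabetical exploration order playing the role of R, and keeps exactly one representative per \(\sim _D\)-equivalence class; the alphabet and dependence relation are illustrative.

```python
# Sketch: sleep-set pruning over the language Sigma^n with a fixed order R.

def sleep_set_reduce(sigma, n, dependent):
    indep = lambda a, b: (a, b) not in dependent and (b, a) not in dependent
    out = []
    def explore(word, sleep):
        if len(word) == n:
            out.append(word)
            return
        done = []
        for a in sigma:                  # R(x): alphabetical order at every node
            if a in sleep:
                continue                 # pruned: an equivalent word was already kept
            # after taking 'a', previously explored independent siblings go to sleep
            explore(word + a, {b for b in sleep | set(done) if indep(a, b)})
            done.append(a)
    explore("", set())
    return out

D = {("a", "a"), ("b", "b")}             # a and b independent of each other
print(sleep_set_reduce("ab", 2, D))      # ['aa', 'ab', 'bb']: 'ba' is pruned
```

With \(\varSigma = \{a,b\}\) and a, b independent, the four words of length 2 fall into three \(\sim _D\)-classes, and exactly three representatives survive.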
Lemma 6.5
For all \(R : \varSigma ^* \rightarrow \mathcal {L}in(\varSigma )\), if P is a \(\sim _D\)-closed program then \(P \setminus \mathrm {ignore}_R\) is a D-reduction of P.
The set of all such reductions is \(\mathrm {reduce}_D(P) = \{ P \setminus \mathrm {ignore}_R \mid R : \varSigma ^* \rightarrow \mathcal {L}in(\varSigma ) \}\).
Theorem 6.6
For any regular language P, \(\mathrm {reduce}_D(P)\) is accepted by an LTA.
Interestingly, every reduction in \(\mathrm {reduce}_D(P)\) is optimal in the sense that each reduction contains at most one representative of each equivalence class of \(\sim _D\).
Theorem 6.7
Fix some \(P \subseteq \varSigma ^*\) and \(R : \varSigma ^* \rightarrow \mathcal {L}in(\varSigma )\). For all \(x, y \in P \setminus \mathrm {ignore}_R\), if \(x \sim _D y\) then \(x = y\).
7 Algorithms
Figure 4 illustrates the outline of our verification algorithm. It is a counterexampleguided abstraction refinement loop in the style of [12, 13, 17]. The key difference is that instead of checking whether some proof \(\varPi \) is a proof for the program P, it checks if there exists a reduction of the program P that \(\varPi \) proves correct.
The algorithm relies on an oracle Interpolate that, given a finite set of program traces C, returns a proof \(\varPi '\), if one exists, such that \(C \subseteq \mathcal {L}_{}(\varPi ')\). In our tool, we use Craig interpolation to implement the oracle Interpolate. In general, since program traces are the simplest form of sequential programs (loop- and branch-free), any automated program prover that can handle them may be used.
The results presented in Sects. 5 and 6 give rise to the proof-checking subroutine of the algorithm in Fig. 4 (i.e. the light grey test). Given a program DFA \(A_P = (Q_P, \varSigma , \delta _P, q_{P0}, F_P)\) and a proof DFA \(A_\varPi = (Q_\varPi , \varSigma , \delta _\varPi , q_{\varPi 0}, F_\varPi )\) (obtained by determinizing \(\varPi _{NFA}\)), we can decide whether some \(P' \in \mathrm {reduce}_D(\mathcal {L}_{}(A_P))\) satisfies \(P' \subseteq \mathcal {L}_{}(A_\varPi )\) by constructing an LTA \(M_{P\varPi }\) for \(\mathrm {reduce}_D(\mathcal {L}_{}(A_P)) \cap {\mathcal {P}(\mathcal {L}_{}(A_\varPi ))}\) and checking emptiness (Theorem 5.2).
7.1 Progress
The algorithm corresponding to Fig. 4 satisfies a weak progress theorem: none of the counterexamples from a round of the algorithm will ever appear in a future counterexample set. This, however, is not strong enough to guarantee termination. Alternatively, one can think of the algorithm’s progress as follows. In each round, new assertions are discovered through the oracle Interpolate, and one can optimistically hope that the algorithm finally converges on an existing target proof \(\varPi ^*\). The success of this algorithm depends on two factors: (1) the counterexamples used by the algorithm belong to \(\mathcal {L}_{}(\varPi ^*)\), and (2) the proof that Interpolate discovers for these counterexamples coincides with \(\varPi ^*\). The latter is a known wild card in software model checking, which cannot be guaranteed; there is plenty of empirical evidence, however, that procedures based on Craig interpolation do well in approximating it. The former is a new problem for our refinement loop.
In a standard algorithm in the style of [12, 13, 17], the verification proof rule dictates that every program trace must be in \(\mathcal {L}_{}(\varPi ^*)\). In our setting, we only require a subset (corresponding to some reduction) to be in \(\mathcal {L}_{}(\varPi ^*)\). This means one cannot simply rely on program traces as appropriate counterexamples. Theorem 5.4 presents a solution to this problem. It ensures that we always feed Interpolate some counterexample from \(\varPi ^*\) and therefore guarantee progress.
Theorem 7.1
(Strong Progress). Assume a proof \(\varPi ^*\) exists for some reduction \(P^* \in \mathcal {R}\), and that Interpolate always returns some subset of \(\varPi ^*\) for traces in \(\mathcal {L}_{}(\varPi ^*)\). Then the algorithm will terminate in at most \(|\varPi ^*|\) iterations.
Theorem 7.1 ensures that the algorithm will never get into an infinite loop due to a bad choice of counterexamples. The condition on Interpolate ensures that divergence does not occur due to the wrong choice of assertions by Interpolate and without it any standard interpolationbased software model checking algorithm may diverge. The assumption that there exists a proof for a reduction of the program in the fixed set \(\mathcal {R}\) ensures that the proof checking procedure can verify the target proof \(\varPi ^*\) once it is reached. Note that, in general, a proof may exist for a reduction of the program which is not in \(\mathcal {R}\). Therefore, the algorithm is not complete with respect to all reductions, since checking the premises of SafeRed1 is undecidable as discussed in Sect. 4.
7.2 Faster Proof Checking Through Antichains
The state set of \(M_{P\varPi }\), the intersection of the program and proof LTAs, is \(Q_P \times \mathbb {B}\times \mathcal {P}(\varSigma ) \times Q_\varPi \), whose size is exponential in \(|\varSigma |\). Therefore, even a linear emptiness test for this LTA can be computationally expensive. Antichains have been previously used [8] to optimize certain operations over NFAs that also suffer from exponential blowups, such as deciding universality and inclusion. The main idea is that these operations involve computing downwards-closed and upwards-closed sets according to an appropriate subsumption relation, which can be represented compactly as antichains. We employ similar techniques to propose a new emptiness check algorithm.
Antichains. The set of maximal elements of a set X with respect to some ordering relation \(\sqsubseteq \) is denoted \({\text {max}}(X)\). The downwards-closure of a set X with respect to \(\sqsubseteq \) is denoted \({\lfloor X\rfloor }\). An antichain is a set X in which no element is related (by \(\sqsubseteq \)) to another. The set of maximal elements \({\text {max}}(X)\) of a finite set X forms an antichain. If X is downwards-closed then \({\lfloor {\text {max}}(X)\rfloor } = X\).
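These antichain operations can be sketched directly for families of finite sets ordered by inclusion (an illustrative stand-in for the subsumption relation \(\sqsubseteq _{P\varPi }\)):

```python
# Sketch: max(X) and downwards-closure membership for sets ordered by inclusion.

def maximal(xs):
    """max(X): keep only elements not strictly below another element."""
    return {x for x in xs if not any(x < y for y in xs)}  # '<' is strict subset

def down_closure_member(antichain, x):
    """x lies in the downwards-closure of X iff some maximal element covers it."""
    return any(x <= y for y in antichain)

family = [frozenset(s) for s in [{1}, {2}, {1, 2}, {1, 2, 3}, {3}]]
ac = maximal(family)
print(sorted(map(sorted, ac)))                      # [[1, 2, 3]]: one maximal element
print(down_closure_member(ac, frozenset({2, 3})))   # True: {2,3} is covered by {1,2,3}
```

Here the five-element downwards-closed family collapses to a single maximal element, illustrating how an antichain can be much smaller than the set it represents.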
If \(\mathrm {inactive}(M)\) is downwardsclosed with respect to some subsumption relation \((\sqsubseteq ) \subseteq Q \times Q\), then we need not represent all of \(\mathrm {inactive}(M)\). The antichain \(\max (\mathrm {inactive}(M))\) of maximal elements of \(\mathrm {inactive}(M)\) (with respect to \(\sqsubseteq \)) would be sufficient to represent the entirety of \(\mathrm {inactive}(M)\), and can be exponentially smaller than \(\mathrm {inactive}(M)\), depending on the choice of relation \(\sqsubseteq \).
We are most interested in the state space of \(M_{P\varPi }\), which is \(Q_{P\varPi } = (Q_P \times \mathbb {B}\times {\mathcal {P}(\varSigma )}) \times Q_\varPi \). Observe that states whose \(\mathbb {B}\) part is \(\top \) are always active:
Lemma 7.2
\(((q_P, \top , S), q_\varPi ) \notin \mathrm {inactive}(M_{P\varPi })\) for all \(q_P \in Q_P\), \(q_\varPi \in Q_\varPi \), and \(S \subseteq \varSigma \).
Lemma 7.3
\(F_{M_{P\varPi }}\) preserves downwardsclosedness with respect to \(\sqsubseteq _{P\varPi }\).
Lemma 7.4
A full justification appears in [14]. Formulating \(F^{\max }_{M_{P\varPi }}\) as a higherorder function allows us to calculate \(\max (\mathrm {inactive}(M_{P\varPi }))\) using efficient fixpoint algorithms like the one in [22]. Algorithm 1 outlines our proof checking routine. \(\textsc {Fix} : ((A \rightarrow B) \rightarrow (A \rightarrow B)) \rightarrow (A \rightarrow B)\) is a procedure that computes the least fixpoint of its input. The algorithm simply computes the fixpoint of the function \(F^{\max }_{M_{P\varPi }}\) as defined in Lemma 7.4, which is a compact representation of \(\mathrm {inactive}(M_{P\varPi })\) and checks if the start state of \(M_{P\varPi }\) is in it.
Counterexamples. Theorem 5.4 states that a finite set of counterexamples exists whenever \(\mathcal {R} \cap {\mathcal {P}(\mathcal {L}_{}(\varPi ))} = \emptyset \). The proof of emptiness for an LTA, formed using rule Inactive above, is a finite tree. Each edge in the tree is labelled by an element of \(\varSigma \) (obtained from the existential in the rule), and the paths through this tree form the counterexample set. To compute this set, then, it suffices to remember enough information during the computation of \(\mathrm {inactive}(M)\) to reconstruct the proof tree. Every time a state q is determined to be inactive, we must also record the witness \(a \in \varSigma \) for each transition \((q, B, \sigma ) \in \varDelta \) such that \(\sigma (a) \in \mathrm {inactive}(M)\).
In an antichain-based algorithm, once we determine a state q to be inactive, we simultaneously determine everything it subsumes (i.e. every \(q' \sqsubseteq q\)) to be inactive as well. If we recorded unique witnesses for each and every state that q subsumes, then the space complexity of our antichain algorithm would be the same as that of the unoptimized version. The following lemma states that it is sufficient to record witnesses only for q and discard witnesses for the states that q subsumes.
Lemma 7.5
Fix some states \(q, q'\) such that \(q' \sqsubseteq _{P\varPi } q\). A witness used to prove q is inactive can also be used to prove \(q'\) is inactive.
Note that this means that the antichain algorithm soundly returns potentially fewer counterexamples than the original one.
7.3 Partition Optimization
The LTA construction for \(\mathrm {reduce}_D(P)\) involves a nondeterministic choice of linear order at each state. Since \(\mathcal {L}in(\varSigma )\) has size \(|\varSigma |!\), each state in the automaton would have a large number of transitions. As an optimization, our algorithm selects ordering relations out of \(\mathcal {P}art(\varSigma )\) (instead of \(\mathcal {L}in(\varSigma )\)), defined as \(\mathcal {P}art(\varSigma ) = \{ \varSigma _1 \times \varSigma _2 \mid \varSigma _1 \uplus \varSigma _2 = \varSigma \}\) where \(\uplus \) is disjoint union. This leads to a sound algorithm that is not complete with respect to sleep set reductions, and trades the factorial complexity of computing \(\mathcal {L}in(\varSigma )\) for an exponential one.
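A quick illustration of the savings (with an illustrative four-letter alphabet): \(\mathcal {P}art(\varSigma )\) has one element per subset \(\varSigma _1\), i.e. \(2^{|\varSigma |}\) relations, against \(|\varSigma |!\) linear orders.

```python
# Sketch: enumerating Part(Sigma) = { S1 x S2 | S1 disjoint-union S2 = Sigma }.
from itertools import chain, combinations
from math import factorial

def part(sigma):
    sigma = set(sigma)
    subsets = chain.from_iterable(combinations(sigma, r) for r in range(len(sigma) + 1))
    # each partition (S1, S2) induces the ordering relation S1 x S2
    return [(set(s1), sigma - set(s1)) for s1 in subsets]

sigma = {"a", "b", "c", "d"}
print(len(part(sigma)), factorial(len(sigma)))  # 16 partition orders vs 24 linear orders
```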
8 Experimental Results
To evaluate our approach, we have implemented our algorithm in a tool called Weaver, written in Haskell. Weaver accepts a program written in a simple imperative language as input, where the property is already encoded in the program in the form of assume statements, and attempts to prove the program correct. The dependence relation for each input program is computed using a heuristic that ensures \(\sim _D\)-closedness. It is based on the fact that the shuffle product (i.e. parallel composition) of two \(\sim _D\)-closed languages is \(\sim _D\)-closed.
Weaver employs two verification algorithms: (1) the total order algorithm presented in Algorithm 1, and (2) the variation with the partition optimization discussed in Sect. 7.3. It also implements multiple counterexample generation algorithms: (1) Naive: selects the first counterexample in the difference of the program and proof languages. (2) Progress-Ensuring: selects a set of counterexamples satisfying Theorem 5.4. (3) Bounded Progress-Ensuring: selects a few counterexamples (in most cases just one) from the set computed by the progress-ensuring algorithm. Our experimentation demonstrated that in the vast majority of cases, the bounded progress-ensuring algorithm (an instance of the partition algorithm) is the fastest of all options. Therefore, all our reports in this section use this instance of the algorithm.
For the larger benchmarks, we use a simple sound optimization to reduce the proof size. We declare the basic blocks of code as atomic, so that internal assertions need not be generated for them as part of the proof. This optimization is incomplete with respect to sleep set reductions.
Benchmarks. We use a set of sequential benchmarks from [24], and add sequential benchmarks that involve more interesting reductions in their proofs. We also have a set of parallel benchmarks, which are beyond the scope of previous hypersafety verification techniques; we use them to demonstrate that our technique and tool can seamlessly handle concurrency. These involve proving concurrency-specific hypersafety properties, such as determinism and the equivalence of parallel and sequential implementations of algorithms. Finally, since the proof checking algorithm is the core contribution of this paper, we include a contrived set of instances to stress test it. These involve proving determinism of simple disjoint parallel programs with varying numbers of threads and statements per thread; they are designed to cause a combinatorial explosion in the proof checker and counterexample generation routines. More information on the benchmarks can be found in [14].
Evaluation

Table 1. Experimental result averages for benchmark groups.

| Benchmark group | Group count | Proof size | Number of refinement rounds | Proof construction time | Proof checking time | Total time |
|---|---|---|---|---|---|---|
| Looping programs of [24], 2-safety properties | 5 | 63 | 12 | 46.69 s | 0.1 s | 47.03 s |
| Looping programs of [24], 3-safety properties | 8 | 155 | 22 | 475.78 s | 11.79 s | 448.36 s |
| Loop-free programs of [24] | 27 | 5 | 2 | 0.13 s | 0.0004 s | 0.15 s |
| Our sequential benchmarks | 13 | 30 | 9 | 14.27 s | 2.5 s | 17.94 s |
| Our parallel benchmarks | 7 | 31 | 8 | 17.95 s | 0.56 s | 18.63 s |
Proof construction time refers to the time spent constructing \(\mathcal {L}(\varPi )\) from a given set of assertions \(\varPi \), and excludes the time needed to produce proofs for the counterexamples in a given round. Proof checking time is the time spent checking whether the current proof candidate is strong enough for a reduction of the program. In the fastest instances (total time around 0.01 s), roughly equal time is spent in proof checking and proof construction. In the slowest instances, the total time is almost entirely spent in proof construction. In contrast, in our stress tests (designed to stress the proof checking algorithm), the majority of the time is spent in proof checking. The time spent proving counterexamples correct is negligible in all instances. Proof sizes vary from 4 assertions for the simplest instances to 298 for the most complicated one. Verification times correlate with the final proof size: larger proofs tend to cause longer verification times.
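At its core, proof checking is a language-inclusion test: is every run of some reduction covered by the proof? Weaver performs this check on looping tree automata with further optimizations; as a much-simplified illustration of the underlying idea, here is the naive product construction for inclusion between two complete DFAs (all names and the encoding are ours):

```python
from collections import deque

def dfa_included(d1, s1, f1, d2, s2, f2, alphabet):
    """L(A1) ⊆ L(A2) iff no reachable product state is accepting in A1
    but rejecting in A2. d1 and d2 map (state, letter) -> state."""
    seen = {(s1, s2)}
    queue = deque(seen)
    while queue:
        p, q = queue.popleft()
        if p in f1 and q not in f2:
            return False  # some word is accepted by A1 and rejected by A2
        for a in alphabet:
            nxt = (d1[p, a], d2[q, a])
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return True

# A1 accepts words with an even number of 'a's; A2 accepts all words over {a, b}.
d1 = {(0, "a"): 1, (0, "b"): 0, (1, "a"): 0, (1, "b"): 1}
d2 = {("t", "a"): "t", ("t", "b"): "t"}
print(dfa_included(d1, 0, {0}, d2, "t", {"t"}, "ab"))  # True
print(dfa_included(d2, "t", {"t"}, d1, 0, {0}, "ab"))  # False
```

The failing product states found by this search play the role of counterexamples: runs of the program (reduction) not yet covered by the proof candidate.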
The number of refinement rounds varies from 2 for the simplest instance to 33 for the most complicated one. A small number of rounds (e.g. 2) implies a fast verification time, but for higher round counts there is no strong positive correlation between the number of rounds and the verification time.
For our parallel program benchmarks (other than the stress tests), the tool spends the majority of its time in proof construction. Therefore, we designed specific (unusual) parallel programs to stress test the proof checker. The stress test benchmarks are trivial tests of determinism of disjoint parallel programs, which can easily be proven correct using the atomic block optimization; however, we force the tool to do this unnecessary hard work. These instances simulate the worst-case theoretical complexity, where the proof checking time and the number of counterexamples grow exponentially with the number of threads and the sizes of the threads. In the largest instance, more than 99% of the total verification time is spent in proof checking. Averages are not very informative for these instances and are therefore not included in Table 1.
Finally, Weaver is slow only when verifying 3-safety properties of the large looping benchmarks from [24]. Note that unlike the approach of [24], which starts from a default lockstep reduction (which happens to be sufficient to prove these instances), we do not assume any particular reduction and consider them all. The extra time is therefore expected when the product programs become quite large.
9 Related Work
The notion of a k-safety hyperproperty was introduced in [7], without consideration of automatic program verification. The approach of reducing k-safety to 1-safety by self-composition was introduced in [5]. While theoretically complete, self-composition is not practical, as discussed in Sect. 1. Product programs generalize the self-composition approach and have been used in verifying translation validation [20], non-interference [16, 23], and program optimization [25]. A product of two programs \(P_1\) and \(P_2\) is semantically equivalent to \(P_1 \cdot P_2\) (their sequential composition), but is made easier to verify by allowing parts of the two programs to be interleaved. The product programs proposed in [3] allow lockstep interleaving exclusively, and only when the control structures of \(P_1\) and \(P_2\) match. This restriction is lifted in [4] to allow some non-lockstep interleavings. However, the given construction rules are nondeterministic, and the choice of product program is left to the user or a heuristic.
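As a toy illustration of self-composition for a 2-safety property (an encoding of our own, not the input syntax of any of the cited systems): to check determinism, run two memory-disjoint copies of the program sequentially on the same input and assert that the outputs agree.

```python
def prog(x):
    """A simple deterministic program: computes 2 * x by repeated addition."""
    y = 0
    for _ in range(x):
        y += 2
    return y

def self_composition(x):
    """2-safety check: two memory-disjoint copies run one after the other."""
    y1 = prog(x)  # first copy
    y2 = prog(x)  # second copy, same input
    assert y1 == y2, "nondeterminism witnessed by two runs"
    return y1

print(self_composition(5))  # 10
```

The proof burden here is a loop invariant relating y1 and y2 across the sequential composition, i.e. the full functionality of each copy must be summarized; interleaved product programs avoid this by letting the two loops be reasoned about in lockstep.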
Relational program logics [6, 28] extend traditional program logics to allow reasoning about relational program properties; however, automation is usually not addressed. Automatic construction of product programs is discussed in [10], with the goal of supporting procedure specifications and modular reasoning, but it is also restricted to lockstep interleavings. Our approach does not support procedure calls, but it is fully automated and permits non-lockstep interleavings.
The key feature of our approach is that it automates the discovery of a program reduction and its proof together. The only other method offering comparable automation is the one based on Cartesian Hoare Logic (CHL) proposed in [24], along with an algorithm for automatic verification based on CHL. Their algorithm implicitly constructs a product program, using a heuristic that favours lockstep executions as much as possible, and then prioritizes certain rules of the logic over the rest. The heuristic nature of the proof search means that no characterization of the search space can be given, and there is no guarantee that an appropriate product program will be found. In contrast, this paper gives a formal characterization of the set of explored product programs. Moreover, CHL was not designed to deal with concurrency.
Lipton [19] first proposed reduction as a way to simplify reasoning about concurrent programs. His ideas have been employed in a semi-automatic setting in [11]. Partial-order reduction (POR) is a family of techniques that shrink the search state space by removing redundant paths. POR techniques are concerned with finding a single (preferably minimal) reduction of the input program; in contrast, we use the same underlying ideas to explore many program reductions simultaneously. The class of reductions described in Sect. 6 is based on the sleep set technique of Godefroid [15]. Other techniques [1, 15] are used in conjunction with sleep sets to achieve minimality in the standard POR setting. In our setting, the reductions generated by sleep sets are already optimal (Theorem 6.7); however, employing these additional POR techniques may suggest ways of optimizing our proof checking algorithm by producing a smaller reduction LTA.
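A minimal sketch of the sleep set idea (an illustrative Python rendering of Godefroid's technique, with names of our own choosing): once the branch for action a has been fully explored, a is put to sleep for the remaining sibling branches, and a sleeping action stays asleep in a child only if it is independent of the action just taken.

```python
def sleep_set_runs(enabled, indep, word=(), sleep=frozenset()):
    """Enumerate interleavings of a set of always-enabled actions,
    pruning redundant ones with sleep sets. indep is a symmetric predicate."""
    if not enabled:
        yield word
        return
    sleeping = set(sleep)
    for a in sorted(enabled):
        if a in sleeping:
            continue
        # keep a sibling's action asleep only if it commutes with a
        child_sleep = frozenset(b for b in sleeping if indep(a, b))
        yield from sleep_set_runs(enabled - {a}, indep, word + (a,), child_sleep)
        sleeping.add(a)  # a is now explored; prune it from later siblings

acts = frozenset("abc")
print(len(list(sleep_set_runs(acts, lambda x, y: True))))   # 1: all commute
print(len(list(sleep_set_runs(acts, lambda x, y: False))))  # 6: nothing commutes
```

When all actions commute there is a single Mazurkiewicz trace and exactly one representative run survives; when nothing commutes, all 3! interleavings must be kept. Note that sleep sets prune all redundant complete runs but may still explore branches that end without yielding anything.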
References
1. Abdulla, P.A., Aronis, S., Jonsson, B., Sagonas, K.: Source sets: a foundation for optimal dynamic partial order reduction. J. ACM 64(4), 25 (2017)
2. Baader, F., Tobies, S.: The inverse method implements the automata approach for modal satisfiability. In: Goré, R., Leitsch, A., Nipkow, T. (eds.) IJCAR 2001. LNCS, vol. 2083, pp. 92–106. Springer, Heidelberg (2001). https://doi.org/10.1007/3-540-45744-5_8
3. Barthe, G., Crespo, J.M., Kunz, C.: Relational verification using product programs. In: Butler, M., Schulte, W. (eds.) FM 2011. LNCS, vol. 6664, pp. 200–214. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-21437-0_17
4. Barthe, G., Crespo, J.M., Kunz, C.: Beyond 2-safety: asymmetric product programs for relational program verification. In: Artemov, S., Nerode, A. (eds.) LFCS 2013. LNCS, vol. 7734, pp. 29–43. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-35722-0_3
5. Barthe, G., D'Argenio, P.R., Rezk, T.: Secure information flow by self-composition. Math. Struct. Comput. Sci. 21(6), 1207–1252 (2011)
6. Benton, N.: Simple relational correctness proofs for static analyses and program transformations. In: ACM SIGPLAN Notices, vol. 39, pp. 14–25. ACM (2004)
7. Clarkson, M.R., Schneider, F.B.: Hyperproperties. In: 21st IEEE Computer Security Foundations Symposium, pp. 51–65. IEEE (2008)
8. De Wulf, M., Doyen, L., Henzinger, T.A., Raskin, J.-F.: Antichains: a new algorithm for checking universality of finite automata. In: Ball, T., Jones, R.B. (eds.) CAV 2006. LNCS, vol. 4144, pp. 17–30. Springer, Heidelberg (2006). https://doi.org/10.1007/11817963_5
9. Diekert, V., Métivier, Y.: Partial commutation and traces. In: Rozenberg, G., Salomaa, A. (eds.) Handbook of Formal Languages, pp. 457–533. Springer, Heidelberg (1997). https://doi.org/10.1007/978-3-642-59126-6_8
10. Eilers, M., Müller, P., Hitz, S.: Modular product programs. In: Ahmed, A. (ed.) ESOP 2018. LNCS, vol. 10801, pp. 502–529. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-89884-1_18
11. Elmas, T., Qadeer, S., Tasiran, S.: A calculus of atomic actions. In: ACM SIGPLAN Notices, vol. 44, pp. 2–15. ACM (2009)
12. Farzan, A., Kincaid, Z., Podelski, A.: Inductive data flow graphs. In: ACM SIGPLAN Notices, vol. 48, pp. 129–142. ACM (2013)
13. Farzan, A., Kincaid, Z., Podelski, A.: Proof spaces for unbounded parallelism. In: ACM SIGPLAN Notices, vol. 50, pp. 407–420. ACM (2015)
14. Farzan, A., Vandikas, A.: Reductions for automated hypersafety verification (2019)
15. Godefroid, P. (ed.): Partial-Order Methods for the Verification of Concurrent Systems: An Approach to the State-Explosion Problem. LNCS, vol. 1032. Springer, Heidelberg (1996). https://doi.org/10.1007/3-540-60761-7
16. Goguen, J.A., Meseguer, J.: Security policies and security models. In: 1982 IEEE Symposium on Security and Privacy, p. 11. IEEE (1982)
17. Heizmann, M., Hoenicke, J., Podelski, A.: Refinement of trace abstraction. In: Palsberg, J., Su, Z. (eds.) SAS 2009. LNCS, vol. 5673, pp. 69–85. Springer, Heidelberg (2009). https://doi.org/10.1007/978-3-642-03237-0_7
18. Hopcroft, J.E., Motwani, R., Ullman, J.D.: Introduction to Automata Theory, Languages, and Computation, 3rd edn. Addison-Wesley Longman Publishing Co. Inc., Boston (2006)
19. Lipton, R.J.: Reduction: a method of proving properties of parallel programs. Commun. ACM 18(12), 717–721 (1975)
20. Pnueli, A., Siegel, M., Singerman, E.: Translation validation. In: Steffen, B. (ed.) TACAS 1998. LNCS, vol. 1384, pp. 151–166. Springer, Heidelberg (1998). https://doi.org/10.1007/BFb0054170
21. Popeea, C., Rybalchenko, A., Wilhelm, A.: Reduction for compositional verification of multithreaded programs. In: Formal Methods in Computer-Aided Design (FMCAD) 2014, pp. 187–194. IEEE (2014)
22. Pottier, F.: Lazy least fixed points in ML (2009)
23. Sabelfeld, A., Myers, A.C.: Language-based information-flow security. IEEE J. Sel. Areas Commun. 21(1), 5–19 (2003)
24. Sousa, M., Dillig, I.: Cartesian Hoare logic for verifying k-safety properties. In: ACM SIGPLAN Notices, vol. 51, pp. 57–69. ACM (2016)
25. Sousa, M., Dillig, I., Vytiniotis, D., Dillig, T., Gkantsidis, C.: Consolidation of queries with user-defined functions. In: ACM SIGPLAN Notices, vol. 49, pp. 554–564. ACM (2014)
26. Terauchi, T., Aiken, A.: Secure information flow as a safety problem. In: Hankin, C., Siveroni, I. (eds.) SAS 2005. LNCS, vol. 3672, pp. 352–367. Springer, Heidelberg (2005). https://doi.org/10.1007/11547662_24
27. Vardi, M.Y., Wolper, P.: Reasoning about infinite computations. Inf. Comput. 115(1), 1–37 (1994)
28. Yang, H.: Relational separation logic. Theor. Comput. Sci. 375(1–3), 308–334 (2007)