
1 Introduction

The verification of relational program specifications is of wide interest, having many applications. Relational specifications can describe multiple runs of the same program or relate the behaviors of multiple programs. An example of the former is the verification of security properties such as non-interference, where different executions of the same program are compared to check whether there is a leak of sensitive information. The latter is useful for checking equivalence or refinement relationships between programs after applying some transformations or during iterative development of different software versions.

There is a rich history of work on the relational verification of programs. Representative efforts include those that target general analysis using relational program logics and frameworks [4, 5, 8, 27, 31] or specific applications such as security verification [1, 7, 9], compiler validation [16, 32], and differential program analysis [17, 19, 21, 22, 23]. These efforts are supported by tools that range from automatic verifiers to interactive theorem-provers. In particular, many automatic verifiers are based on constructing a composition over the programs under consideration, where the relational property over multiple runs (of the same or different programs) is translated into a functional property over a single run of a composed program. This has the benefit that standard techniques and tools for program verification can then be applied.

However, it is also well known that a naively composed program can lead to difficult verification problems for automatic verifiers. For example, a sequential composition of two loops would require effective techniques for generating loop invariants. In contrast, a parallel composition would provide potential for aligning the loop bodies, where relational invariants may be easier to establish than a functional loop invariant. Examples of techniques that exploit opportunities for such alignment include use of type-based analysis with self-composition [29], allowing flexibility in composition to be a mix of sequential and parallel [6], exploiting structurally equivalent programs for compiler validation [32], lockstep execution of loops in reasoning using Cartesian Hoare Logic [27], and merging Horn clause rules for relational verification [13, 24].

In this paper, we present a compositional framework that leverages relational specifications to further simplify the generated verification tasks on the composed program. Our framework is motivated by two main strategies. The first strategy, similar to the efforts mentioned above, is to exploit opportunities for synchrony, i.e., aligning code fragments across which relational invariants are easy to derive, perhaps due to functional similarity or due to similar code structure, etc. Specifically, we choose to synchronize the programs at conditional blocks as well as at loops. Similar to closely related efforts [6, 27], we would like to execute loops in lockstep so that relational invariants can be derived over corresponding iterations over the loop bodies. Specifically, we propose a novel technique that analyzes the relational specifications to infer, under reasonable assumptions, maximal sets of loops that can be executed in lockstep. Synchronizing at conditional blocks in addition to loops enables simplification due to relational specifications and conditional guards that might result in infeasible or redundant subtasks. Pruning of such infeasible subtasks has been performed and noted as important in existing work [27], and synchronizing at conditional blocks allows us to prune eagerly. More importantly, aligning different programs at conditional statements sets up our next strategy.

Our second strategy is the exploitation of symmetry in relational specifications. Due to control flow divergences or non-lockstep executions of loops, even different copies of the same program may proceed along different code fragments. However, some of the resulting verification subtasks may be indistinguishable from each other due to underlying symmetries among related fragments. We analyze the relational specifications, expressed as formulas in first-order theories (e.g., linear integer arithmetic) with multi-index variables, to discover symmetries and exploit them to prune away redundant subtasks. Prior works on use of symmetry in model checking [11, 14, 15, 20] are typically based on symmetric states satisfying the same set of indexed atomic propositions, and do not consider symmetries among different indices in specifications. To the best of our knowledge, ours is the first work to extract such symmetries in relational specifications, and to use them for pruning redundant subtasks during relational verification. For extracting these symmetries, we have lifted core ideas from symmetry-discovery and symmetry-breaking in SAT formulas [12] to richer formulas in first-order theories.

The strategies we propose for exploiting synchrony and symmetry via relational specifications are fairly general in that they can be employed in various verification methods. We provide a generic logic-based description of these strategies at a high level (Sect. 4), and also describe a specific instantiation in a verification algorithm based on forward analysis that computes strongest postconditions (Sect. 5). We have implemented our approach in a prototype tool called Synonym built on top of the Descartes tool [27]. Our experimental evaluation (Sect. 6) shows that our approach improves verification performance on many examples, with only marginal overhead on smaller ones. In particular, exploiting symmetry is crucial in enabling verification to complete for some properties, without which Descartes exceeds the timeout on all benchmark examples.

2 Motivating Example

Consider three C-like integer programs \(\{P_j\}\) of the form shown in Fig. 1 (left). They are identical modulo renaming, and we use indices \(j \in \{1,2,3\}\) as subscripts to denote variables in the different copies. We assume that each variable initially takes a nondeterministic value in each program.

Fig. 1.

Example program (left), and eight possible control-flow decisions (right).

A relational verification problem (RVP) is a tuple consisting of programs \(\{P_j\}\), a relational precondition \( pre \), and a relational postcondition \( post \). In the example RVPs below, we consider the three conditionals, which in turn lead to eight possible control-flow decisions (Fig. 1, right) in a composed program. Each RVP reduces to subproblems for proving that \( post \) can be derived from \( pre \) for each of these control-flow decisions. In the rest of the section, we demonstrate the underlying ideas behind our approach to solve these subproblems efficiently.

Maximizing Lockstep Execution. Given an RVP (referred to as \( RVP _1\)) with precondition \(x_1 < x_3 \wedge x_1> 0 \wedge i_1 > 0 \wedge i_2 \ge i_1 \wedge i_1 = i_3\) (\( pre \)) and postcondition \((x_1 < x_3 \vee y_1 \not = y_3) \wedge i_1 > 0 \wedge i_2 \ge i_1 \wedge i_1 = i_3\) (\( post \)), consider a control-flow decision \(y_1> 20 \wedge y_2> 20 \wedge y_3 > 20\). This leads to another RVP, consisting of three programs of the following form:

where \(j \in \{1,2,3\}\), and the aforementioned \( pre \) and \( post \). From \( pre \), it follows that \(i_1 = i_3\) and \(i_2 \ge i_1\). We can thus infer that the first and third loops are always executed the same number of times, while the second loop may be executed for fewer iterations. This knowledge lets us infer a single relational invariant for the first and third loops and handle the second loop separately. Clearly, the relational invariant \(x_1 < x_3 \wedge i_1 = i_3 \wedge i_1 \le 10\) and the non-relational invariant \(i_2 \le 10\) are enough to derive \( post \). If we were to handle the first and third loop separately, we would need complex nonlinear invariants such as \(x_1 = \frac{x_{1, init } \times i_1!}{i_{1, init }!}\) and \(x_3 = \frac{x_{3, init } \times i_3!}{i_{3, init }!}\), which involve auxiliary variables \(x_{j, init }\) and \(i_{j, init }\) denoting the initial values of \(x_j\) and \(i_j\) respectively.
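The iteration-count claim can be checked by direct simulation. The guard (\(i_j < 10\)) and body (\(x_j = x_j \times i_j;\ i_j = i_j + 1\)) in the sketch below are a reconstruction consistent with the factorial invariant above, not taken verbatim from Fig. 1:

```python
def run_loop(i, x):
    """Run the reconstructed loop; return (iteration count, final i, final x)."""
    count = 0
    while i < 10:              # assumed guard
        x, i = x * i, i + 1    # assumed body (gives the factorial invariant)
        count += 1
    return count, i, x

# pre: i1 > 0, i2 >= i1, i1 = i3 -- enumerate small instances
for i1 in range(1, 15):
    for i2 in range(i1, 15):
        c1 = run_loop(i1, 1)[0]
        c2 = run_loop(i2, 1)[0]
        c3 = run_loop(i1, 1)[0]   # i3 = i1
        assert c1 == c3           # first and third loops iterate equally often
        assert c2 <= c1           # second loop finishes no later
```

Under the given precondition, the first and third counts always coincide, while the second loop never runs longer.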

Symmetry-Breaking. For the same program, and an RVP (referred to as \( RVP _2\)) with precondition \(i_1 > 0 \wedge i_2 \ge i_1 \wedge i_1 = i_3\) and postcondition \(i_1 > 0 \wedge i_2 \ge i_1 \wedge i_1 = i_3\), consider a control-flow decision \(y_1> 20 \wedge y_2 > 20 \wedge y_3 \le 20\). We generate another RVP involving the following set of programs:

Similarly, decision \(y_1 \le 20 \wedge y_2> 20 \wedge y_3 > 20\) generates yet another RVP over the following:

Both RVPs have the same precondition and postcondition as \( RVP _2\). We can see that both RVPs differ only in their subscripts; by taking one and swapping the subscripts 1 and 3 due to symmetry, we arrive at the other. Thus, knowing the verification result for either RVP allows us to skip verifying the other one, by discovering and exploiting such symmetries.

3 Background and Notation

Given a loop-free program over input variables \(\vec {x}\) and output variables \(\vec {y}\) (such that \(\vec {x}\) and \(\vec {y}\) are disjoint), let \( Tr (\vec {x}, \vec {y})\) denote its symbolic encoding.

Proposition 1

Given two loop-free programs, \( Tr _1(\vec {x}_1, \vec {y}_1)\) and \( Tr _2(\vec {x}_2, \vec {y}_2)\), a precondition \( pre (\vec {x}_1, \vec {x}_2)\), and a postcondition \( post (\vec {y}_1, \vec {y}_2)\), the task of relational verification is reduced to checking validity of the following formula.

$$\begin{aligned} pre (\vec {x}_1, \vec {x}_2) \wedge Tr _1(\vec {x}_1, \vec {y}_1) \wedge Tr _2(\vec {x}_2, \vec {y}_2) \!\!\implies \!\! post (\vec {y}_1, \vec {y}_2) \end{aligned}$$
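For intuition, this validity check can be sketched by brute force over a bounded domain (an SMT solver performs the unbounded check); the two straight-line programs here are hypothetical examples, not taken from the paper:

```python
from itertools import product

# Hypothetical loop-free programs: Tr1 encodes y1 = x1 + 1, Tr2 encodes y2 = 2*x2.
tr1 = lambda x1, y1: y1 == x1 + 1
tr2 = lambda x2, y2: y2 == 2 * x2
pre = lambda x1, x2: x1 == x2 and x1 >= 1
post = lambda y1, y2: y2 >= y1            # 2*x >= x + 1 whenever x >= 1

def valid(bound=10):
    """pre /\ Tr1 /\ Tr2 => post, checked exhaustively on [-bound, bound)."""
    for x1, x2, y1, y2 in product(range(-bound, bound), repeat=4):
        if pre(x1, x2) and tr1(x1, y1) and tr2(x2, y2) and not post(y1, y2):
            return False   # counterexample found
    return True

assert valid()
```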

Given a program with one loop (i.e., a transition system) over input variables \(\vec {x}\) and output variables \(\vec {y}\), let \( Init (\vec {x}, \vec {u})\) denote a symbolic encoding of the block of code before the loop, \( Guard (\vec {u})\) denote the loop guard, and \( Tr (\vec {u}, \vec {y})\) encode the loop body. Here, \(\vec {u}\) is the vector of local variables that are live at the loop guard. For example, consider the program from our motivating example:

In its encoding, \(\vec {x}= \vec {u}= (i_1, x_1,y_1)\), \(\vec {y}= (i'_1, x'_1)\), \( Init (\vec {x}, \vec {u}) = (\vec {u} = \vec {x})\), \( Guard (\vec {u}) = (i_1 < 10)\), and \( Tr (\vec {u}, \vec {y}) = (x'_1 = x_1 \times i_1 \wedge i'_1 = i_1 + 1)\).

Proposition 2

(Naive parallel composition). Given two loopy programs, \(\langle Init _1(\vec {x}_1,\vec {u}_1), Guard _1(\vec {u}_1), Tr _1(\vec {u}_1, \vec {y}_1) \rangle \) and \(\langle Init _2(\vec {x}_2,\vec {u}_2), Guard _2(\vec {u}_2), Tr _2(\vec {u}_2, \vec {y}_2) \rangle \), a precondition \( pre (\vec {x}_1, \vec {x}_2)\), and a postcondition \( post (\vec {y}_1, \vec {y}_2)\), the task of relational verification is reduced to the task of finding (individual) inductive invariants \(\varvec{I_1}\) and \(\varvec{I_2}\):

$$\begin{aligned} pre (\vec {x}_1, \vec {x}_2) \wedge Init _1(\vec {x}_1,\vec {u}_1)&\!\!\implies \!\! \varvec{I_1}(\vec {u}_1) \\ pre (\vec {x}_1, \vec {x}_2) \wedge Init _2(\vec {x}_2,\vec {u}_2)&\!\!\implies \!\! \varvec{I_2}(\vec {u}_2) \\\varvec{I_1}(\vec {u}_1) \wedge Guard _1(\vec {u}_1) \wedge Tr _1 (\vec {u}_1, \vec {y}_1)&\!\!\implies \!\! \varvec{I_1}(\vec {y}_1) \\\varvec{I_2}(\vec {u}_2) \wedge Guard _2(\vec {u}_2) \wedge Tr _2 (\vec {u}_2, \vec {y}_2)&\!\!\implies \!\! \varvec{I_2}(\vec {y}_2) \\\varvec{I_1}(\vec {y}_1) \wedge \varvec{I_2}(\vec {y}_2) \wedge \lnot Guard _1(\vec {y}_1) \wedge \lnot Guard _2(\vec {y}_2)&\!\!\implies \!\! post (\vec {y}_1,\vec {y}_2) \end{aligned}$$

Note that the method of naive composition requires handling of multiple invariants, which is known to be difficult. Furthermore, it might lose some important relational information specified in \( pre (\vec {x}_1, \vec {x}_2)\). One way to avoid this is to exploit the fact that loops could be executed in lockstep.

Proposition 3

(Lockstep composition). Given two loopy programs, \(\langle Init _1(\vec {x}_1,\vec {u}_1), Guard _1(\vec {u}_1), Tr _1(\vec {u}_1, \vec {y}_1) \rangle \) and \(\langle Init _2(\vec {x}_2,\vec {u}_2), Guard _2(\vec {u}_2), Tr _2(\vec {u}_2, \vec {y}_2) \rangle \), a precondition \( pre (\vec {x}_1, \vec {x}_2)\), and a postcondition \( post (\vec {y}_1, \vec {y}_2)\), and assuming that both loops iterate exactly the same number of times, the task of relational verification is reduced to the task of finding a single (relational) inductive invariant \(\varvec{I}\):

$$\begin{aligned} pre (\vec {x}_1, \vec {x}_2) \wedge Init _1(\vec {x}_1,\vec {u}_1) \wedge Init _2(\vec {x}_2,\vec {u}_2)&\!\!\implies \!\! \varvec{I}(\vec {u}_1, \vec {u}_2) \\\varvec{I}(\vec {u}_1, \vec {u}_2) \wedge Guard _1(\vec {u}_1) \wedge Tr _1 (\vec {u}_1, \vec {y}_1) \wedge Guard _2(\vec {u}_2) \wedge Tr _2 (\vec {u}_2, \vec {y}_2)&\!\!\implies \!\! \varvec{I}(\vec {y}_1, \vec {y}_2) \\\varvec{I}(\vec {y}_1, \vec {y}_2) \wedge \lnot Guard _1(\vec {y}_1) \wedge \lnot Guard _2(\vec {y}_2)&\!\!\implies \!\! post (\vec {y}_1,\vec {y}_2) \end{aligned}$$

In this paper, we do not focus on a specific method for deriving these invariants – a plethora of suitable methods have been proposed in the literature, and any of these could be used.
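Given such an invariant, the three lockstep rules of Proposition 3 can be checked by enumeration over a bounded state space; the identical example loops and the candidate invariant below are hypothetical illustrations:

```python
from itertools import product

# Two identical loops (hypothetical): while i < 5: { x = x + 2; i = i + 1 }
N = 5
guard = lambda i: i < N
step = lambda i, x: (i + 1, x + 2)

pre = lambda s1, s2: s1 == s2        # equal inputs; a state is (i, x)
post = lambda s1, s2: s1[1] == s2[1]  # equal outputs x
inv = lambda s1, s2: s1 == s2        # candidate relational invariant I

def rules_hold():
    states = list(product(range(N + 2), range(20)))  # bounded (i, x) space
    for s1, s2 in product(states, repeat=2):
        # Rule 1: pre /\ Init => I  (Init is the identity here)
        if pre(s1, s2) and not inv(s1, s2):
            return False
        # Rule 2: I /\ Guard1 /\ Tr1 /\ Guard2 /\ Tr2 => I'
        if inv(s1, s2) and guard(s1[0]) and guard(s2[0]) \
                and not inv(step(*s1), step(*s2)):
            return False
        # Rule 3: I /\ not Guard1 /\ not Guard2 => post
        if inv(s1, s2) and not guard(s1[0]) and not guard(s2[0]) \
                and not post(s1, s2):
            return False
    return True

assert rules_hold()
```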

4 Leveraging Relational Specifications

In this section, we describe the main components of our compositional framework where we leverage relational specifications to simplify the verification subtasks. We first describe our novel algorithm for inferring maximal sets of loops that can be executed in lockstep (Sect. 4.1). Next, we describe our technique for handling conditionals (Sect. 4.2). While this is similar to other prior work, the main purpose here is to set the stage for our novel methods for exploiting symmetry (Sect. 4.3).

4.1 Synchronizing Loops

Given a set of loopy programs, we would like to determine which ones can be executed in lockstep. As mentioned earlier, relational invariants over lockstep loops are often easier to derive than loop invariants over a single copy.

Our algorithm CheckLockstep takes as input a set of loopy programs \(\{P_1, \ldots , P_k\}\) and outputs a set of maximal classes of programs that can be executed in lockstep. The algorithm partitions its input set of programs and recursively calls CheckLockstep on the partitions.

First, CheckLockstep infers a relational inductive invariant over the loop bodies, synthesizing \(\varvec{I}(\vec {u}_1,\ldots ,\vec {u}_k)\) in the following:

$$\begin{aligned} pre(\vec {x}_1, \ldots , \vec {x}_k) \wedge \bigwedge _{i = 1}^k Init (\vec {x}_i,\vec {u}_i)&\implies \varvec{I}(\vec {u}_1, \ldots , \vec {u}_k) \\\varvec{I}(\vec {u}_1, \ldots , \vec {u}_k) \wedge \bigwedge _{i = 1}^k Guard _i(\vec {u}_i) \wedge Tr _i (\vec {u}_i, \vec {y}_i)&\implies \varvec{I}(\vec {y}_1, \ldots , \vec {y}_k) \\\end{aligned}$$

CheckLockstep then poses the following query:

$$\begin{aligned} \lnot \Bigg ( \Big (\varvec{I}(\vec {u}_1,\ldots ,\vec {u}_k) \wedge \bigvee _{i = 1}^k \lnot Guard (\vec {u}_i) \Big ) \implies \bigwedge _{i = 1}^k \lnot Guard (\vec {u}_i) \Bigg ) \end{aligned}$$
(1)

The left-hand side of the implication holds whenever one of the loops has terminated (the relational invariant holds, and at least one of the loop conditions must be false), and the right-hand side holds only if all of the loops have terminated. If the formula is unsatisfiable, then the termination of one loop implies the termination of all loops, and all loops can be executed simultaneously [27]. In this case, the entire set of input programs is one maximal class, and the set containing the set of all input programs is returned.

Otherwise, CheckLockstep gets a satisfying assignment and partitions the input programs into a set \( Terminated \) and a set \( Unfinished \). The \( Terminated \) set contains all programs \(P_i\) whose guards \( Guard (\vec {u}_i)\) are false in the model for the formula, and the \( Unfinished \) set contains the remaining programs. The CheckLockstep algorithm is then called recursively on both \( Terminated \) and \( Unfinished \), with its final result being the union of the two sets returned by these recursive calls.
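A minimal sketch of CheckLockstep follows, where a brute-force reachability computation over bounded loop counters stands in for the invariant synthesizer (so the "invariant" is the strongest one, consistent with the assumption made below); the scalar-counter loop representation is a simplification:

```python
def check_lockstep(progs, inits):
    """progs: list of (guard, step) pairs over a scalar loop counter.
    inits: set of joint initial states (one component per program).
    Returns maximal classes of program indices executable in lockstep."""
    if len(progs) <= 1:
        return [tuple(range(len(progs)))] if progs else []
    # "Invariant synthesis": here, the exact set of joint states reachable
    # under lockstep execution (the strongest relational invariant).
    reach, work = set(inits), list(inits)
    while work:
        s = work.pop()
        if all(g(v) for (g, _), v in zip(progs, s)):
            t = tuple(st(v) for (_, st), v in zip(progs, s))
            if t not in reach:
                reach.add(t)
                work.append(t)
    # Query (1): is there a reachable state where the guards disagree?
    for s in reach:
        vals = [g(v) for (g, _), v in zip(progs, s)]
        if any(vals) and not all(vals):
            term = [k for k, v in enumerate(vals) if not v]
            unfin = [k for k, v in enumerate(vals) if v]
            def recurse(ks):
                return check_lockstep([progs[k] for k in ks],
                                      {tuple(s0[k] for k in ks) for s0 in inits})
            return ([tuple(term[i] for i in c) for c in recurse(term)]
                    + [tuple(unfin[i] for i in c) for c in recurse(unfin)])
    return [tuple(range(len(progs)))]  # guards never disagree: full lockstep

# The three loops of Sect. 2: guard i < 10, step i + 1;
# pre (i1 > 0, i2 >= i1, i1 = i3) enumerated over a bounded range.
P = [(lambda i: i < 10, lambda i: i + 1)] * 3
inits = {(i1, i2, i1) for i1 in range(1, 13) for i2 in range(i1, 13)}
assert sorted(check_lockstep(P, inits)) == [(0, 2), (1,)]
```

On the loops of Sect. 2 this yields the classes \(\{P_1, P_3\}\) and \(\{P_2\}\), matching the analysis of \( RVP _1\).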

The following theorem assumes that any relational invariant \(\varvec{I}(\vec {u}_1, \ldots , \vec {u}_k)\), generated externally and used by the algorithm, is stronger than any relational invariant \(\varvec{I}(\vec {u}_1, \ldots , \vec {u}_{i-1}, \vec {u}_{i + 1}, \ldots , \vec {u}_k)\) that could be synthesized over the same set of k loops with the \(i^{ th }\) loop removed.

Theorem 1

For any call to CheckLockstep, it always partitions its set of input programs such that for all \(P_i \in Terminated \) and \(P_j \in Unfinished \), \(P_i\) and \(P_j\) cannot be executed in lockstep.

Proof

Assume that CheckLockstep has partitioned its set of programs into the \( Terminated \) and \( Unfinished \) sets. Let \(P_i \in Terminated , P_j \in Unfinished \) be arbitrary programs. Based on how the partitioning is performed, we know that there is a model for Eq. 1 such that \( Guard (\vec {u}_i)\) does not hold and \( Guard (\vec {u}_j)\) does. We can thus conclude that the following formula is satisfiable:

$$\lnot \Big (\varvec{I}(\vec {u}_1,\ldots ,\vec {u}_k) \wedge \lnot Guard (\vec {u}_i)\implies \lnot Guard (\vec {u}_j) \Big )$$

From the assumption on our invariant synthesizer, we conclude that the following is also satisfiable, indicating that \(P_i\) and \(P_j\) cannot be executed in lockstep:

$$\lnot \Big (\varvec{I}(\vec {u}_i,\vec {u}_j) \wedge \lnot Guard (\vec {u}_i)\implies \lnot Guard (\vec {u}_j) \Big )$$

where \(\varvec{I}(\vec {u}_i,\vec {u}_j)\) is the relational invariant for \(P_i\) and \(P_j\) that our invariant synthesizer infers.     \(\square \)

4.2 Synchronizing Conditionals

Let two programs have the form \(\texttt {if~Q}_{i}~\texttt {then~R}_{i}~\texttt {else~S}_{i}\), where \(i \in \{1,2\}\) and \(\texttt {R}_{i}\) and \(\texttt {S}_{i}\) are arbitrary blocks of code, possibly containing loops. Let them be part of some RVP, which reduces to applying Proposition 1, 2, or 3, depending on the content of each block of code, to four pairs of programs. As we have seen in previous sections, each of the four verification tasks could be expensive. To reduce the number of verification tasks where possible, we use the relational preconditions to filter out pairs of programs for which verification conclusions can be derived trivially.

For k programs of the form \(\texttt {if~Q}_{i}~\texttt {then~R}_{i}~\texttt {else~S}_{i}\) for \(i \in \{1, \ldots , k\}\) and precondition \(pre(\vec {x}_1,\ldots ,\vec {x}_k)\), we can simultaneously generate all possible combinations of decisions by querying a solver for all truth assignments to the \(\texttt {Q}_i\)s:

$$\begin{aligned} pre (\vec {x}_1,\ldots ,\vec {x}_k) \wedge \bigwedge _{i = 1}^k Q_i \end{aligned}$$
(2)
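As a concrete sketch, the All-SAT models of Eq. 2 can be enumerated by brute force over a bounded domain (an SMT solver would instead enumerate models with blocking clauses); the precondition here is a hypothetical one that forces \(\texttt {Q}_1 \Leftrightarrow \texttt {Q}_2\):

```python
from itertools import product

# Hypothetical precondition forcing Q1 <=> Q2: pre = (y1 = y2).
pre = lambda y1, y2, y3: y1 == y2
Q = [lambda y1, y2, y3: y1 > 20,
     lambda y1, y2, y3: y2 > 20,
     lambda y1, y2, y3: y3 > 20]

def all_sat(bound=40):
    """Feasible truth assignments to (Q1, Q2, Q3) under pre (cf. Eq. 2)."""
    feasible = set()
    for ys in product(range(bound), repeat=3):
        if pre(*ys):
            feasible.add(tuple(q(*ys) for q in Q))
    return feasible

models = all_sat()
assert len(models) == 4                        # 8 decisions pruned to 4
assert all(v1 == v2 for v1, v2, _ in models)   # Q1 and Q2 always agree
```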

We can then use the result of this All-SAT query to generate sets of programs in subtasks. For each assignment j, where each \(Q_i\) is assigned a Boolean value \(v_i\), the following set is generated: \(\{\texttt {assume(V}_1\texttt {); U}_{1}, \ldots , \texttt {assume(V}_k\texttt {); U}_{k}\}\), where for each \(i \in \{1, \ldots , k\}\), if \(v_i = true \), then \(\texttt {V}_i = \texttt {Q}_i\) and \(\texttt {U}_i = \texttt {R}_i\), else \(\texttt {V}_i = \lnot \texttt {Q}_i\) and \(\texttt {U}_i = \texttt {S}_i\). We need to apply our verification algorithm on only the resulting sets of programs. For example, in our above RVP, if \(\texttt {Q}_1\) is equivalent to \(\texttt {Q}_2\) in all solutions, then the RVP reduces to verification of just two pairs of programs:

4.3 Discovering and Exploiting Symmetries

Using the All-SAT query from Eq. 2 allows us to prune trivial RVPs. However, as we have seen in Sect. 2, some of the remaining RVPs could be regarded as equivalent due to symmetry. First, we discuss how to identify symmetries in formulas syntactically, and then we show how to use such symmetries.

Fig. 2.

Graph with vertex names (outside the vertices) and colors (inside the vertices).

4.3.1 Identifying Symmetries in Formulas

Formally, symmetries in formulas are defined as permutations. Note that any permutation \(\pi \) of set \(\{1,\ldots ,k\}\) can be lifted to be a permutation of set \(\{\vec {x}_1,\ldots ,\vec {x}_k\}\).

Definition 1

(Symmetry). Let \(\vec {x}_1,\ldots ,\vec {x}_k\) be vectors of the same size over disjoint sets of variables. A symmetry \(\pi \) of a formula \(F(\vec {x}_1,\ldots ,\vec {x}_k)\) is a permutation of set \({\{\vec {x}_i \mid 1 \le i \le k\}}\) such that \(F(\vec {x}_1,\ldots ,\vec {x}_k) \iff F(\pi (\vec {x}_1),\ldots ,\pi (\vec {x}_k))\).

The task of finding symmetries within a set of formulas can be performed syntactically by first canonicalizing the formulas, converting the formulas into a graph representation of their syntax, and then using a graph automorphism algorithm to find the symmetries of the graph. We demonstrate how this can be done for a formula \(\varphi \) over Linear Integer Arithmetic with the following example.

Let \(\varphi = (x_1 \le x_2 \wedge x_3 \le x_4) \wedge ( x_1< z_2 \vee x_3 < z_4)\). Note that this formula is symmetric under a permutation of the subscripts that simultaneously swaps 1 with 3 and 2 with 4. Let \(\{(x_1,z_1), (x_2,z_2), (x_3,z_3), (x_4,z_4)\}\) be the vectors of variables. We identify a vector by its subscript (e.g., we identify \((x_1,z_1)\) by 1).

Our algorithm starts with canonicalizing the formula: \(\varphi = (x_1< x_2 \vee x_1 = x_2) \wedge (x_3< x_4 \vee x_3 = x_4) \wedge (x_1< z_2 \vee x_3 < z_4)\). It then constructs a colored graph for the canonicalized formula with the procedure in Algorithm 1. The algorithm initializes a graph by the set of k vertices \(v^{ Id}_1, \ldots , v^{ Id}_k\) with color \( Id \) (vertices 21–24 in Fig. 2), where k is the number of identifiers. It then (Line 3) adds to the graph the union of the abstract syntax trees (AST) for the formula’s conjuncts, where each vertex has a color corresponding to the type of its AST node. If a parent vertex has a color of an ordering-sensitive operation or predicate, then the children should have colors that include a tag to indicate their ordering (e.g., vertices 9 and 10 in Fig. 2 have colors with tags because their parent has color <, but vertices 11 and 12 do not have tags because their parent has color \(=\)). Next (Line 4), the algorithm performs an appropriate renaming of vertex colors so that each indexed variable name \(x_i\) is replaced with a non-indexed version x, while simultaneously adding edges from each vertex with a renamed color to \(v^{ Id}_i\). The resulting graph for \(\varphi \) is shown in Fig. 2. Finally, the algorithm applies a graph automorphism finder to get the following automorphism (in addition to the identity automorphism), which is shown here in a cyclic notation where (x y) means that \(x \mapsto y\) and \(y \mapsto x\) (vertices that map to themselves are omitted):

$$(0~1)(3~5)(4~6)(7~8)(9~13)(10~14)(11~15)(12~16)(17~19)(18~20)(21~23)(22~24)$$

We are only interested in permutations of the vectors, so we project out the relevant part of the permutation, (21 23)(22 24), and map it back to our vector identifiers to get the following permutation on the identifiers:

$$\pi = \{1 \mapsto 3, 2 \mapsto 4, 3 \mapsto 1, 4 \mapsto 2\}$$
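Since our prototype ultimately checks symmetries by testing whether each permutation yields an equivalent formula (Sect. 6), the same result can be obtained with a brute-force sketch that canonicalizes \(\varphi \) as a nested set structure and compares it against its permuted versions:

```python
from itertools import permutations

# phi = (x1 <= x2 /\ x3 <= x4) /\ (x1 < z2 \/ x3 < z4), represented as a
# canonical nested-set structure (conjunction and disjunction as frozensets).
def phi(p):  # p maps each old index to its new index
    return frozenset([
        ('le', ('x', p[1]), ('x', p[2])),
        ('le', ('x', p[3]), ('x', p[4])),
        frozenset([('lt', ('x', p[1]), ('z', p[2])),
                   ('lt', ('x', p[3]), ('z', p[4]))]),
    ])

identity = {i: i for i in range(1, 5)}
syms = []
for perm in permutations(range(1, 5)):
    p = dict(zip(range(1, 5), perm))
    if phi(p) == phi(identity):   # permuted formula syntactically equal?
        syms.append(p)

# Exactly the identity and the simultaneous swap (1 3)(2 4) are symmetries.
assert len(syms) == 2
assert {1: 3, 2: 4, 3: 1, 4: 2} in syms
```

For large k, the graph-automorphism construction above avoids this factorial enumeration.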

4.3.2 Exploiting Symmetries

We now define the notion of symmetric RVPs and application of symmetry-breaking to generate a single representative per equivalence class of RVPs.

Definition 2

(Symmetric RVPs). Two RVPs: \(\langle Ps, pre (\vec {x}_1, \ldots , \vec {x}_k), post (\vec {y}_1, \ldots , \vec {y}_k) \rangle \) and \(\langle Ps', pre (\vec {x}_1, \ldots , \vec {x}_k), post (\vec {y}_1, \ldots , \vec {y}_k) \rangle \), where \(Ps = \{P_1, \ldots , P_k\}\), and \(Ps' = \{P'_1, \ldots , P'_k\}\), are called symmetric under a permutation \(\pi \) iff

  1. \(\pi \) is a symmetry of the formula \( pre (\vec {x}_1, \ldots , \vec {x}_k) \wedge post (\vec {y}_1, \ldots , \vec {y}_k)\), and

  2. for every \(P_i \in Ps\) and \(P_j \in Ps'\), if \(\pi (i) = j\), then \(P_i\) and \(P_j\) have the same number of inputs and outputs and have logically equivalent encodings for the same set of input variables \(\vec {x}_i\) and output variables \(\vec {y}_i\).

As we have seen in Sect. 4.3.1, symmetries can be identified purely at the syntactic level of the relational preconditions and postconditions. For each detected symmetry, it remains to check equivalence between the corresponding programs' encodings, which can be formulated as an SMT problem.

To exploit symmetries, we propose a simple but intuitive approach. First, we identify the set of symmetries using \( pre \wedge post \). Then, we solve the All-SAT query from Eq. 2 and get a reduced set \( R \) of RVPs (i.e., one without all trivial problems). For each \( RVP _i \in R \), we perform the relational verification only if no symmetric \( RVP _j \in R \) has already been verified. Thus, the most expensive part of the routine, checking equivalence of RVPs, is performed on demand and only on a subset of all possible pairs \(\langle RVP _i, RVP _j \rangle \).
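A sketch of this pruning: representing each RVP in \( R \) by its control-flow decision and verifying only one canonical representative per orbit of the symmetry group (here the hypothetical encoding uses the group for \( RVP _2\), i.e., the identity and the swap of indices 1 and 3):

```python
from itertools import product

# Symmetry group for pre /\ post of RVP_2: identity and the swap of
# indices 1 and 3, written as index permutations over the three programs.
group = [(0, 1, 2), (2, 1, 0)]

def canon(decision):
    """Lexicographically least image of a control-flow decision under the group."""
    return min(tuple(decision[i] for i in g) for g in group)

decisions = list(product([False, True], repeat=3))   # the eight decisions
representatives = {canon(d) for d in decisions}
assert len(representatives) == 6   # two of the eight RVPs are symmetric copies
```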

Alternatively, in some cases (e.g., for parallelizing the algorithm) it might help to identify all symmetric RVPs prior to solving the All-SAT query from Eq. 2. From this set, we can generate symmetry-breaking predicates (SBPs) [12] and conjoin them to Eq. 2. Constrained with SBPs, this query will have fewer models, and will contain a single representative per equivalence class of RVPs. We describe how to construct SBPs in more detail in the next section.

4.3.3 Generating Symmetry-Breaking Predicates (SBPs)

SBPs have previously been applied in pruning the search space explored by SAT solvers. Traditionally, techniques construct SBPs based on symmetries in truth assignments to the literals in the formula, but SBP-construction can be adapted to be based on symmetries in truth assignments to conditionals, allowing us to break symmetries in our setting.

We can construct an SBP by treating each condition the way a literal is treated in existing SBP constructions. In particular, we can construct the common Lex-Leader SBP used for predicate logic [12], which in our case will force a solver to choose the lexicographically least representative per equivalence class for a particular ordering of the conditions. For the ordering of conditions where \(\mathtt{Q}_i \le \mathtt{Q}_j\) iff \(i \le j\) and a set of symmetries S over \(\{1,\ldots ,k\}\), we can construct a Lex-Leader SBP \(SBP(S) = \bigwedge _{\pi \in S} PP(\pi )\) with the more efficient predicate chaining construction [2], where we have that

$$\begin{aligned} PP(\pi )&= p_{\mathrm {min}(I)} \wedge \bigwedge _{i \in I} \Big ( p_i \implies \big ( g_{ prev (i,I)} \implies ( l_i \wedge p_{ next (i,I)} ) \big ) \Big ) \end{aligned}$$

and that I is the support of \(\pi \) with the last condition for each cycle removed, \(\mathrm {min}(I)\) is the minimal element of I, \( prev (i,I)\) is the maximal element of I still less than i or 0 if there is none, \( next (i,I)\) is the minimal element of I still greater than i or 0 if there is none, \(p_0 = g_0 = true \), \(p_i\) is a fresh predicate for \(i \ne 0\), \(g_i = \mathtt{Q}_{\pi (i)} \implies \mathtt{Q}_{i}\) for \(i \ne 0\), and \(l_i = \mathtt{Q}_i \implies \mathtt{Q}_{\pi (i)}\).
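The predicate-chaining construction can be sketched as follows; formulas are plain nested tuples, and removing the largest index of each cycle is one consistent reading of "the last condition for each cycle":

```python
def cycles(pi):
    """Cycle decomposition of a permutation given as a dict {i: pi(i)}."""
    seen, out = set(), []
    for i in sorted(pi):
        if i in seen or pi[i] == i:
            continue
        c, j = [], i
        while j not in seen:
            seen.add(j)
            c.append(j)
            j = pi[j]
        out.append(c)
    return out

def pp(pi):
    """Predicate-chaining encoding of PP(pi); Q(i) is the i-th condition,
    p(i) a fresh chaining predicate, and index 0 denotes 'true'."""
    Q = lambda i: ('Q', i)
    p = lambda i: ('p', i) if i else True
    g = lambda i: ('implies', Q(pi[i]), Q(i)) if i else True
    l = lambda i: ('implies', Q(i), Q(pi[i]))
    # support of pi with the last condition of each cycle removed
    I = sorted(x for c in cycles(pi) for x in c[:-1])
    prev = lambda i: max([j for j in I if j < i], default=0)
    nxt = lambda i: min([j for j in I if j > i], default=0)
    conj = [p(min(I))]
    conj += [('implies', p(i), ('implies', g(prev(i)), ('and', l(i), p(nxt(i)))))
             for i in I]
    return ('and', *conj)

# The swap of indices 1 and 3 yields p1 /\ (p1 => (true => ((Q1 => Q3) /\ true))).
f = pp({1: 3, 2: 2, 3: 1})
assert f == ('and', ('p', 1),
             ('implies', ('p', 1),
              ('implies', True,
               ('and', ('implies', ('Q', 1), ('Q', 3)), True))))
```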

After constructing the SBP, we conjoin it to the All-SAT query in Eq. 2. Our solver now generates sets of programs that, when combined with the relational precondition and postcondition, form a set of irredundant RVPs.

Example. Let us consider how SBPs can be applied to \( RVP _2\) from Sect. 2 to avoid generating two of the eight RVPs we would otherwise generate.

First, we see that our three programs are all copies of the same program and are at the same program point, so they will have the same encoding. Next, we find the set of permutations S over \(\{1,2,3\}\) such that for each \(\pi \in S\), we have that \(i_1 > 0 \wedge i_2 \ge i_1 \wedge i_1 = i_3\) iff \(i_{\pi (1)} > 0 \wedge i_{\pi (2)} \ge i_{\pi (1)} \wedge i_{\pi (1)} = i_{\pi (3)}\). In this case, S is the set of permutations \(\{\{1 \mapsto 1, 2 \mapsto 2, 3 \mapsto 3\},\{1 \mapsto 3, 2 \mapsto 2, 3 \mapsto 1\}\}\). Now, we construct a Lex-Leader SBP (using the predicate chaining construction described above):

$$ \begin{array}{l l l} p_1&\wedge&(p_1 \implies ((y_1> 20) \implies (y_3 > 20))) \end{array} $$

where \(p_1\) is a fresh predicate. Conjoining this SBP to Eq. 2 ensures that the RVPs arising from the control-flow decisions \(y_1> 20 \wedge y_2 > 20 \wedge y_3 \le 20\) and \(y_1 > 20 \wedge y_2 \le 20 \wedge y_3 \le 20\) are no longer generated.
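By the construction above (with \(\pi \) swapping indices 1 and 3, \(l_1 = \texttt {Q}_1 \implies \texttt {Q}_3\)), the SBP with \(p_1\) set to true reduces to \(\texttt {Q}_1 \implies \texttt {Q}_3\); a quick enumeration confirms that exactly two of the eight decisions are pruned:

```python
from itertools import product

# With p1 true, the Lex-Leader SBP for the swap (1 3) reduces to Q1 => Q3.
sbp = lambda q1, q2, q3: (not q1) or q3

decisions = list(product([False, True], repeat=3))   # (Q1, Q2, Q3)
kept = [d for d in decisions if sbp(*d)]
pruned = [d for d in decisions if not sbp(*d)]
assert len(kept) == 6
assert set(pruned) == {(True, True, False), (True, False, False)}
```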

5 Instantiation of Strategies in Forward Analysis

We now describe an instantiation of our proposed strategies in a verification algorithm based on forward analysis using a strongest-postcondition computation. Other instantiations, e.g., on top of a Horn solver based on Property-Directed Reachability [24] are possible, but outside the scope of this work.


Given an RVP in the form of a Hoare triple \(\{ Pre \}~P_1 || \cdots || P_k~\{ Post \}\), where || denotes parallel composition, the top-level Verify procedure takes as input the relational specification \( pre = Pre \) and \( post = Post \), the set of input programs \( Current = \{P_1,\ldots ,P_k\}\), and empty sets \( Loops \) and \( Ifs \). It uses a strongest-postcondition computation to compute the next Hoare triple at each step until it can conclude the validity of the original Hoare triple.

Synchronization. Throughout verification, the algorithm maintains three disjoint sets of programs: one for programs that are currently being processed (\( Current \)), one for programs that have been processed up until a loop (\( Loops \)), and one for programs that have been processed up until a conditional statement (\( Ifs \)). The algorithm processes statements in each program independently, with ProcessStatement choosing an arbitrary interleaving of statements from the programs in \( Current \). When the algorithm encounters the end of a program in its call to ProcessStatement, it removes this program from the \( Current \) set. At this point, the algorithm returns safe if the current Hoare triple is proven valid. When a program has reached a point of control-flow divergence and is processed by ProcessStatement, it is removed from \( Current \) and added to the appropriate set (\( Loops \) or \( Ifs \)).

Handling Loops. Once all programs are in the \( Loops \) or \( Ifs \) sets (i.e., \( Current = \varnothing \)), the algorithm handles the programs in the \( Loops \) set if it is nonempty. HandleLoops behaves like CheckLockstep but computes postconditions where possible; when a set of loops can be executed in lockstep, HandleLoops computes their postconditions before placing the programs into the \( Terminated \) set. After all loops have been placed in the \( Terminated \) set and a new precondition \( pre '\) has been computed, rather than returning \( Terminated \), HandleLoops invokes \(\textsc {Verify}( pre ', Terminated , Ifs ,\varnothing , post )\).

Handling Conditionals. When \( Current = Loops = \varnothing \), Verify handles conditional statements. HandleIfs exploits symmetries by using the All-SAT query with Lex-Leader SBPs as described in Sect. 4 and calls Verify on each generated verification problem.

6 Implementation and Evaluation

To evaluate the effectiveness of increased lockstep execution of loops and symmetry-breaking, we implemented our algorithm from Sect. 5 on top of the Descartes tool for verifying k-safety properties, i.e., RVPs over k identical Java programs. We implemented two variants: Syn uses only synchrony (i.e., no symmetry is used), while Synonym uses both. All implementations (including Descartes) use the same guess-and-check invariant generator (the one originally used by Descartes, modified to generate more candidate invariants). In Synonym, we compute symmetries in preconditions and postconditions only when all program copies are the same. For our examples, it sufficed to compute symmetries simply by checking whether each possible permutation leads to equivalent formulas. We compare the performance of our prototype implementations to Descartes using two metrics: the time taken and the number of Hoare triples processed by the verification procedure. All experiments were conducted on a MacBook Pro with a 2.7 GHz Intel Core i5 processor and 8 GB RAM.

6.1 Stackoverflow Benchmarks

The first set of benchmarks we consider are the Stackoverflow benchmarks originally used to evaluate Descartes. These implement (correctly or incorrectly) the Java Comparator or Comparable interface; we check whether their compare functions satisfy the following properties:

  • P1: \(\forall x,y.\; \mathit{sgn}(\mathit{compare}(x,y)) = -\mathit{sgn}(\mathit{compare}(y,x))\)

  • P2: \(\forall x,y,z.\; (\mathit{compare}(x,y) > 0 \wedge \mathit{compare}(y,z) > 0) \implies \mathit{compare}(x,z) > 0\)

  • P3: \(\forall x,y,z.\; (\mathit{compare}(x,y) = 0) \implies (\mathit{sgn}(\mathit{compare}(x,z)) = \mathit{sgn}(\mathit{compare}(y,z)))\)
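These properties can be exercised directly on small input domains. The sketch below (in Python rather than Java, with a hypothetical buggy comparator for contrast) checks P1 by brute force over a finite domain; the verifiers, of course, prove the properties for all inputs rather than testing them.

```python
def sgn(v):
    # Sign function: -1, 0, or 1
    return (v > 0) - (v < 0)

def p1_holds(compare, domain):
    # P1 (antisymmetry): sgn(compare(x, y)) == -sgn(compare(y, x))
    return all(sgn(compare(x, y)) == -sgn(compare(y, x))
               for x in domain for y in domain)

# A correct comparator satisfies P1...
good = lambda x, y: x - y
# ...while this buggy one (hypothetical, in the spirit of the
# benchmarks) never returns 0, so P1 fails when x == y.
buggy = lambda x, y: 1 if x > y else -1
```

For instance, buggy(2, 2) returns -1, but P1 requires sgn(compare(2, 2)) to equal its own negation, i.e., 0.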

(One of the original 34 Stackoverflow examples is excluded from our evaluation here because of the inability of the invariant generator to produce a suitable invariant.) We compare the results of running Syn and Synonym vs. Descartes for each property in Table 1. (Expanded versions and plots of these results are available in an extended version of the paper [26].)

Because property P1 contains a symmetry, using symmetry reduces the number of Hoare triples for this property; however, the overhead of computing symmetries leads to Synonym running more slowly than Syn even on some examples with reduced Hoare triple counts. Property P1 is also the easiest to prove (all implementations verify each example in under 0.3 s), so the overheads contribute more significantly to the runtime. On the examples where our implementations do not perform as well as Descartes, our performance remains reasonably close to it; these examples are typically smaller, and again overheads play a larger role in the difference.

Table 1. Stackoverflow Benchmarks. Total times (in seconds) and Hoare triple counts (HTC) for the Stackoverflow benchmarks. For each property, the results for Syn and Synonym are divided into examples where they exhibit a factor of improvement over Descartes greater than or equal to 1 (top) and examples where they do not (bottom). Improv reports the factor of improvement over Descartes, with the number of examples given in parentheses.

6.2 Modified Stackoverflow Benchmarks

The original Stackoverflow examples are fairly small, with all implementations taking under 6 s to verify any example. To assess how we perform on larger examples, we modified several of the larger Stackoverflow comparator examples to be longer, take more arguments, and contain more control-flow decisions. The resulting functions take three arguments and pick the “largest” object’s id, where comparison among objects is performed based on the original Stackoverflow example code. (Ties are broken by choosing the least id.) We check whether these pick functions satisfy the following properties that allow reordering input arguments:

  • P13: \(\forall x,y,z.\; \mathit{pick}(x,y,z) = \mathit{pick}(y,x,z)\)

  • P14: \(\forall x,y,z.\; \mathit{pick}(x,y,z) = \mathit{pick}(y,x,z) \wedge \mathit{pick}(x,y,z) = \mathit{pick}(z,y,x)\)

Note that P13 allows swapping the first two input arguments, while P14 allows any permutation of inputs, a useful hyperproperty.
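A minimal model of such a pick function makes the symmetry in P14 concrete. The sketch below assumes objects are simply (key, id) pairs as a stand-in for the benchmark objects; the comparison-by-key with least-id tie-breaking mirrors the description above.

```python
from itertools import permutations

def pick(x, y, z):
    # Choose the id of the "largest" object; break ties by least id.
    # Objects are modeled as (key, id) pairs for illustration.
    best = max((x, y, z), key=lambda o: (o[0], -o[1]))
    return best[1]

def p14_holds(pick_fn, objs):
    # P14: the result is invariant under any permutation of the inputs
    base = pick_fn(*objs)
    return all(pick_fn(*p) == base for p in permutations(objs))
```

Because pick depends only on the multiset of its arguments, every permutation of the inputs yields the same id.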

The results from running property P13 are shown in Table 2. We see here that for these larger examples, Hoare triple counts are more reliably correlated with the time taken to perform verification. Syn outperforms Descartes on 14 of the 16 examples, and Synonym outperforms both Descartes and Syn on all 16 examples.

The results from running property P14 are shown in Table 3. For this property, note that Descartes is unable to verify any of the examples within a one-hour timeout. Meanwhile, Syn verifies 10 of the 16 examples without exceeding the timeout. Exploiting symmetries yields a clear further improvement: Synonym not only verifies the same examples as Syn, consistently faster on the larger ones, but also verifies one additional example within the hour.

Table 2. Verifying P13 for modified Stackoverflow examples. Times (in seconds) and Hoare triple counts (HTC).
Table 3. Verifying P14 for modified Stackoverflow examples. Times (in seconds) and Hoare triple counts (HTC). - indicates that no sufficient invariant could be inferred.

Summary of Experimental Results. Our experiments indicate that our performance improvements are consistent: on all Descartes benchmarks (in Table 1, which are all small) our techniques either have low overhead or show some improvement despite the overhead, and on the modified (bigger) programs they lead to significant improvements. In particular, we report (Table 2) speedups of up to 21.4x (on an example where the property does not hold) and 4.2x (on an example where it does). More importantly, we report (Table 3) that Descartes times out on 14 examples; of these, Synonym times out on only 2 and fails to infer an invariant for one.

7 Related Work

The work most closely related to ours is by Sousa and Dillig [27], which proposed Cartesian Hoare Logic (CHL) for proving k-safety properties and the tool Descartes for automated reasoning in CHL. In addition to the core program logic, CHL includes additional proof rules for loops, referred to as Cartesian Loop Logic (CLL). A generalization of CHL, called Quantitative Cartesian Hoare Logic was subsequently used by Chen et al. [10] to detect side-channel vulnerabilities in cryptographic implementations.

In terms of comparison, neither CHL nor CLL force alignment at conditional statements or take advantage of symmetries. We believe our algorithm for identifying a maximal set of lockstep loops is also novel and can be used in other methods that do not rely on CHL/CLL. On the other hand, CLL proof rules allow not only fully lockstep loops, but also partially lockstep loops. Although we did not consider it here, our maximal lockstep-loop detection algorithm can be combined with their partial lockstep execution to further improve the efficiency of verification. For example, applying the Fusion 2 rule from CLL to our example while loops generated from \(RVP_1\) (Sect. 2) would result in three subproblems and require reasoning twice about the second copy's loop finishing later. When combined with maximal lockstep-loop detection, we could generate just two subproblems: one where the first and third loops terminate first, and another where the second loop terminates first.

Other automatic efforts for relational verification typically use some kind of product programs [6, 13, 17, 21, 22, 24, 28], with a possible reduction to Horn solving [13, 17, 21, 24]. Similarly to our strategy for synchrony, most of them attempt to leverage similarity (structural or functional) in programs to ease verification. However, we have seen less focus on leveraging relational specifications themselves to simplify verification tasks, although this varies according to the verification method used. Some efforts do not reason over product programs at all, relying on techniques based on decomposition [3] or customized theories with theorem proving [4, 30] instead. To the best of our knowledge, none of these efforts exploit symmetry in programs or in relational specifications.

On the other hand, symmetry has been used very successfully in model checking parametric finite state systems [11, 15, 20] and concurrent programs [14]. Our work differs from these efforts in two main respects. First, the parametric systems considered in these efforts have components that interact with each other or share variables. Second, the correctness specifications are also parametric, usually single-index or double-index properties in a propositional (temporal) logic. In contrast, in our RVPs, the individual programs are independent and do not share any common variables. The only interaction between them is via relational specifications. Furthermore, we discover symmetries in these relational specifications over multi-index variables, expressed as formulas in first-order theories (e.g., linear integer arithmetic). We then exploit these symmetries to prune redundant RVPs during verification.

There are also some similarities between relational verification and verification of concurrent/parallel programs. In the latter, a typical verifier [18] would use visible operations (i.e., synchronization operations or communication on shared state) as synchronizing points in the composed program. In our work, this selection is made based on the structure of the component programs and the ease of utilizing or deriving relational assertions for the code fragments. Furthermore, one does not need to consider different orderings in interleavings of programs in the RVPs. Since these fragments are independent, it suffices to explore any one ordering. Instead, we exploit symmetries in the relational assertions to prune away redundant RVPs.

Finally, specific applications may impose additional synchrony requirements pertaining to visibility. For example, one may want to check for information leaks from private inputs to public outputs not only at the end of a program but at other specified intermediate points, or information leakage models for side-channel attacks may check for leaks based on given observer models [1]. Such requirements can be viewed as relational specifications at selected synchronizing points in the composed program. Again, we can leverage these relational specifications to simplify the resulting verification subproblems.

8 Conclusions and Future Work

We have proposed novel techniques for improving relational verification, which has several applications including security verification, program equivalence checking, and regression verification. Our two key ideas are maximizing the amount of code that can be synchronized and identifying symmetries in relational specifications to avoid redundant subtasks. Our prototype implementation on top of the Descartes verification tool leads to consistent improvements on a range of benchmarks. In the future, we would be interested in implementing these ideas on top of a Horn-based relational verifier (e.g., [25]) and extending it to work with recursive data structures. We are also interested in developing an algorithm for finding symmetries in formulas that does not rely on an external graph automorphism tool.