1 Verification Approach and Software Architecture

SymDIVINE is a model checker that primarily aims for verification of parallel C and C++ programs. In contrast to explicit-state model checker [1], SymDIVINE represents data values symbolically and can therefore handle programs with inputs, which would otherwise cause state-space explosion due to the number of possible input values. In particular, SymDIVINE uses the control-explicit data-symbolic (ceds) approach to model checking in which control-flow of the program is represented explicitly and values of data structures are represented symbolically [1, 7].

We now describe the approach in more detail. In a ceds model checker, each generated state is a triple that contains a control part (program counter for each thread), explicit data storage, and symbolic data storage. The explicit data storage keeps values of constants and of variables whose values are uniquely determined. The symbolic data storage represents a set of possible values of program variables by a first-order formula in the theory of bit-vectors. To generate the state space, SymDIVINE explores all possible evaluations of the program and tracks the effect of program instructions on the explicit values and on the formula representing the symbolic values. To avoid exploring infeasible paths, an smt solver is used to check satisfiability of the formula representing the data values. The current version of SymDIVINE relies on the smt solver Z3 [6]. For purposes of the competition we used version 4.4.1. Additionally, in order to avoid generating unnecessary thread interleavings, SymDIVINE collapses steps invisible to other threads into a single transition using the \(\tau \)-reduction algorithm [3].

In addition to verification of safety properties, SymDIVINE also supports verification of properties specified in ltl. To check such properties, SymDIVINE uses standard ltl model checking algorithms based on detection of accepting cycles in the product of the program with the Büchi automaton. However, in order to detect accepting cycles, SymDIVINE has to be able to test states for equality. The equality of states is represented as a quantified bit-vector formula, which is handed to an smt solver [1]. Use of the state equality test can also reduce the state space, as the same state is represented and explored only once. On the other hand, the equality test requires potentially expensive quantified smt reasoning.

To increase performance of SymDIVINE, we added several optimizations in the latest version. The first one, state slicing, is a new method of state representation. In this representation, the symbolic part of the state is represented by multiple independent formulas that describe sets of variables that do not affect each other. This allows for more efficient emptiness and equality tests as query results can be cached and smaller queries (related only to the changed program variables) can be issued. Moreover, the issued queries are usually smaller and can, in many cases, be handled by internal SymDIVINE optimizations, like checking for the syntactic equality of formulas, without the need to query the smt solver. The motivation for state slicing comes from the observation regarding the verified llvm bitcode. As the llvm bitcode is in the single static assignment (ssa) form, individual instructions usually affect only a few variables. These local changes are often independent of the rest of the state. This is not just the case for concurrent programs, but also for sequential programs containing repeated function calls or non-trivial loops. We have also implemented caching, which can leverage the decomposition of the issued smt queries to independent parts.

The second optimization is the integration of formula simplifications based on elimination of unconstrained variables [4, 5] (i.e. variables that occur only once in the formula) from quantified bit-vector formulas. The effectivity of such simplifications also follows from the ssa form of llvm: the formulas generated by SymDIVINE often contain many unconstrained variables. Although the elimination of unconstrained variables in quantifier-free formulas is provided by standard smt solvers, we have extended the approach to quantified formulas, which is necessary for equality queries generated by SymDIVINE. Therefore, we have implemented our own elimination of unconstrained variables from quantified bit-vector formulas in SymDIVINE.

From the implementation point of view, SymDIVINE can be seen as three components – an llvm interpreter, a state representation and an exploration algorithm. The algorithm uses the interpreter to produce successors of each state and uses emptiness and equality tests provided by the state representation to detect empty (unreachable) or already visited states. An overview of this architecture can be seen in Fig. 1. In the picture, the smt store refers to the original storage of states and the partial store refers to the newly implemented storage using state slicing. Both storages are available and users can use whichever they prefer. The entire tool is written in C++ and leverages the llvm framework. Thanks to the well-defined interface, each of the three main components is easily interchangeable.

Fig. 1.
figure 1

High-level overview of the SymDIVINE architecture. Nested boxes correspond to interfaces and their concrete implementations.

2 Strengths and Weaknesses

The main strength of the approach is its universality: although it is aimed at parallel programs, SymDIVINE is applicable to all competition categories except termination, heap manipulation and overflows. SymDIVINE can also verify programs in multiple programming languages, as it uses the llvm bitcode as the input format.

SymDIVINE is also precise: it can find every race condition in the program regardless of the necessary number of context switches, and thanks to the symbolic representation in the bit-vector theory, the verification is also bit-precise. Moreover, unlike symbolic execution or bounded model checkers, SymDIVINE also handles programs with infinite behaviour provided that their state space is finite. The usage of the llvm infrastructure allows to precisely capture compiler optimizations and architecture-specific issues such as the bit width of variables.

On the other hand, the approach does not deal well with loops with number of iterations dependent on an input. In the worst-case scenario, SymDIVINE unrolls the cycle completely, resulting in an enormous state space. SymDIVINE also cannot handle programs that spawn an infinite number of threads or allocate memory from the heap. Support for other smt solvers is not currently implemented in SymDIVINE.

3 Tool Setup and Configuration

In order to run SymDIVINE, libboost-graph, Z3 and clang-3.5 have to be installed. If ltl model checking is requested, ltl2tgba is also required.

A prebuilt package of the tool (version 0.5) can be downloaded from a GitHub releaseFootnote 1. The archive contains binaries for SymDIVINE and also a run script that eases the process of verification by automatically compiling C/C++ files to the llvm bitcode. To verify a C program, run run_symdivine <symdivine_dir> [options] <benchmark>, where <symdivine_dir> is a directory in which the SymDIVINE executable is located. All available options can be listed by using the switch --help. We decided to opt-out from categories Arrays, BitVectors, Heap Data Structures and Floats. The tool should be run with options --fix_volatile --fix_inline --silent -Os.

4 Software Project and Contributors

SymDIVINE source code can be found on GitHubFootnote 2 under the MIT License. The tool is developed at the Faculty of Informatics, Masaryk University, and includes contributions by the authors of this paper, Petr Bauch, and Vojtěch Havel.