
1 Introduction

In this paper we present our tool ParCoSS (Parallelized Compiled Symbolic Simulation) for the verification of cooperative multithreading programs available in the Extended Intermediate Verification Language (XIVL) format. The XIVL extends the SystemC IVL [11, 15], which has been designed to capture the simulation semantics of SystemC programs [2, 10, 13], with a small core of OOP features to facilitate the translation of C++ code [16]. For verification purposes the XIVL supports computations with symbolic expressions as well as the assume and assert functions with their usual semantics. Our tool and a set of XIVL examples are available at [1].

Verification of (cooperative) multithreading programs is difficult due to the large state space caused by all possible inputs and thread interleavings. Symbolic simulation, a combination of symbolic execution [4, 14] and Partial Order Reduction (POR) [8, 9], has been shown to be particularly effective in tackling state explosion [5, 6, 15]. Recently, Compiled Symbolic Simulation (CSS) has been proposed as a further improvement [12]. CSS works by integrating the symbolic execution engine and the POR-based scheduler, together with the multithreading program (e.g. available in the XIVL format), into a single C++ program. A standard C++ compiler is then used to generate a native binary whose execution performs exhaustive verification of the multithreading program. In contrast to traditional verification methods based on interpretation, CSS can provide significant simulation speed-ups, especially through native execution of concrete operations.

The implementation of our tool ParCoSS is based on CSS and additionally supports parallelization to further improve simulation performance. Compared to the original CSS approach, our tool uses a fork/join-based state space exploration instead of manually cloning execution states to handle non-deterministic choices arising from symbolic branches and scheduling decisions. A fork/join-based architecture most notably has the following advantages: (1) it allows more efficient code to be generated, and (2) it drastically simplifies the implementation.

Fig. 1. XIVL example program

In particular, we avoid the layer of indirection that is necessary for variable accesses when execution states are tracked manually, and we use native execution for all function calls by employing coroutines. Besides a very efficient context-switch implementation, coroutines allow a natural implementation of the algorithms without unwinding the native stack and without using state machines to resume execution on context switches. Additionally, manually cloning complex internal data structures is error prone and difficult to implement efficiently, whereas the fork system call is already very mature and highly optimized. Finally, our architecture allows for straightforward and efficient parallelization by leveraging the process scheduling and memory sharing capabilities of the underlying operating system.

2 Extended Intermediate Verification Language (XIVL)

An example cooperative multithreading program illustrating the core features of the XIVL is shown in Fig. 1. The program uses two threads to compute the sum of odd numbers up to the bound specified by the variable x, which is initialized with a symbolic expression of type int in Line 2 and constrained in Line 28. The threads synchronize using the wait and notify functions on the global event e. The XIVL syntax resembles C++: it supports integer and boolean data types with all arithmetic and logic operators, arrays and pointers, uses high-level control flow structures, and offers a small set of OOP features including classes, inheritance, and virtual methods with overrides and dynamic dispatch.

Fig. 2. Tool overview

3 Implementation Details

To simplify development and to facilitate code re-use and the translation process from XIVL to C++, we have implemented the PCSS (Parallel CSS) library, which provides common building blocks for parallel symbolic simulation. The PCSS library is linked with the C++ program during compilation. An overview of our tool is shown in Fig. 2. In the following we describe our PCSS library, provide more details on the fork/join-based exploration, and briefly sketch the translation process from XIVL to C++.

3.1 PCSS Library

The right-hand side of Fig. 2 shows the main components of the PCSS library and their interaction. Essentially, it consists of the following components: kernel, scheduler, SMT engine, fork engine, and some process-shared data.

The kernel provides a small set of functions which directly correspond to the kernel-related primitives of the XIVL (e.g. wait and notify) and makes it possible to simulate the SystemC event-driven simulation semantics. Furthermore, all thread functions of the XIVL model are registered in the kernel. The kernel allocates a coroutine with each thread function as entry point. Coroutines naturally implement context switches, as they allow execution to jump between arbitrarily nested functions while preserving local data. Our implementation uses the lightweight Boost.Context library, in particular the functions make_fcontext and jump_fcontext to create coroutines and to switch execution between them, respectively. The scheduler is responsible for selecting the next runnable thread inside the scheduling loop of the kernel. Our coroutine implementation makes it easy to switch execution between the scheduling loop and the chosen thread. POR is employed to reduce the number of explored interleavings. The POR dependency relation is statically generated from the XIVL model and encoded into the C++ program during translation; at runtime it is passed to the scheduler during initialization.

The SMT engine provides common functionality required for symbolic execution. It keeps track of the current path condition, handles the assume and assert functions, and checks the feasibility of symbolic branch conditions. Furthermore, the SMT engine provides SMT types and operations. Essentially, this is a lightweight layer around the underlying SMT solver, which allows the employed SMT solver to be swapped transparently.

The fork engine is responsible for splitting the execution process into two independent processes in case of a non-deterministic choice. This happens when both branch directions are feasible in the SMT engine, or when multiple thread choices are still available in the scheduler after applying POR. One of the forked processes continues exploration while the other is suspended until the first terminates. This approach simulates a depth-first search (DFS) of the state space. As an optimization, the fork engine allows up to N processes to run in parallel, where N is a command line parameter of the compiled C++ program. Parallelization is very efficient, as the processes explore disjoint state spaces independently.

Fig. 3. Symbolic branch execution

3.2 Fork/Join Based State Space Exploration

Executing Symbolic Branches. The on_branch function in the SMT engine, shown in Fig. 3, accepts a symbolic branch condition and returns a concrete decision, which is then used to control native C++ control flow structures. The check_branch_status function checks the feasibility of both branch directions by checking the satisfiability of the branch condition and of its negation. In case both branch directions are feasible, the execution forks (Line 4) into two independent processes, which update the path condition (pc) with the normal (Line 6) or the negated condition (Line 8), respectively. Note that the execution is not forked and the path condition is not extended when only a single branch direction is feasible.

Fig. 4. Implementation of parallelized forking

Parallelization. The forked processes communicate using anonymous shared memory, which is created during initialization in the first process using the mmap system call and is thus accessible from all forked child processes. The shared memory essentially contains three pieces of information: (1) a counter variable to ensure that no more than N processes run in parallel, (2) a shutdown flag to gracefully stop the simulation, e.g. when an assertion violation is detected, and (3) an unnamed semaphore to synchronize access. The semaphore is initialized and modified using the sem_init, sem_post and sem_wait functions. Furthermore, each process locally keeps track of the number of its forked child processes (num_children). Figure 4 shows an implementation of the fork function. First, the fork system call is executed. The child process (pid is zero) will never block, since executing only one process does not increase the number of active processes. The parent process, however, will first try to atomically check and increment the shared counter in Line 5. When this fails, i.e. the maximum number N of processes is already working, the parent process waits until a working process finishes, by either awaiting one of its own children (Line 7) or until some other process joins its children (Line 9).

3.3 XIVL to C++ Translation

We use the XIVL example from Fig. 1 to illustrate the XIVL to C++ translation process, which basically performs five steps: (1) Replace native data types (integer and boolean) and operations with SMT types and operations where necessary (here variable x). Variables which are never assigned a symbolic value (here variables i and sum) keep their native type and perform native operations. (2) Instrument control flow code to query the SMT engine for a concrete decision. The branch condition x > 0 is transformed into an SMT expression, e.g. smt->bv_gt(x, smt->bv_val(0)), and wrapped by the on_branch function of the smt_engine. (3) Generate static POR information for the scheduler. Essentially, a static analysis is employed to detect read/write and notify/wait dependencies. A flow- and context-insensitive pointer analysis is used to increase the precision. (4) Redirect builtin XIVL functions to the SMT engine (assume and assert) and to the kernel instance (e.g. wait_event and notify). (5) Add a new main function that initializes the PCSS library and calls the main function of the XIVL model. It initializes the global data of the XIVL model and then enters the scheduling loop of the kernel by calling the start_simulation function.

Table 1. Experiment results, T.O. denotes timeout (limit 750 s)

4 Evaluation and Conclusion

We have evaluated our tool on a set of SystemC benchmarks from the literature [3, 7, 15] and on the TLM model of the Interrupt Controller for Multiple Processors (IRQMP) of the LEON3-based virtual prototype SoCRocket [17]. All experiments have been performed on a Linux system using a 3.5 GHz Intel E3 quad-core with Hyper-Threading. We used Clang 3.5 for compilation of the C++ programs and Z3 v4.4.1 in the SMT layer. The time and memory limits have been set to 750 s and 6 GB, respectively; T.O. denotes that the time limit has been exceeded. The results are shown in Table 1. It shows the simulation time in seconds for Kratos [7], Interpreted Symbolic Simulation (ISS) [6, 15], and our tool ParCoSS with a single process (P-1) and parallelized with four (P-4) and eight (P-8) processes. Comparing P-4 with P-8 allows us to observe the effect of Hyper-Threading. The results demonstrate the potential of our tool and show that our parallelization approach can further improve the results. As expected, CSS can be considerably faster than ISS. On some benchmarks Kratos is faster due to its abstraction technique, but the difference is not significant. Furthermore, Kratos is not applicable to the irqmp benchmark due to missing C++ language features. For future work we plan to integrate dynamic information, for POR and for the selection of code blocks for native execution, into our CSS framework.