Keywords

These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

figure a

1 Introduction

Pointers constitute an essential concept in modern programming languages, and are used for implementing dynamic data structures like lists, trees etc. However, many software bugs can be traced back to the erroneous use of pointers by e.g. dereferencing null pointers or accidentally pointing to wrong parts of the heap. Due to the resulting unbounded state spaces, pointer errors are hard to detect. Automated tool support for validation of pointer programs that provides meaningful debugging information in case of violations is therefore highly desirable.

Attestor is a verification tool that attempts to achieve both of these goals. To this aim, it first constructs an abstract state space of the input program by means of symbolic execution. Each state depicts both links between heap objects and values of program variables using a graph representation. Abstraction is performed on state level by means of graph grammars. They specify the data structures maintained by the program, and describe how to summarise substructures of the heap in order to obtain a finite representation. After labelling each state with propositions that provide information about structural properties such as reachability or heap shapes, the actual verification task is performed in a second step. To this aim, the abstract state space is checked against a user-defined LTL specification. In case of violations, a counterexample is provided.

In summary, Attestor’s main features can be characterized as follows:

  • It employs context-free graph grammars as a formal underpinning for defining heap abstractions. These grammars enable local heap concretisation and thus naturally provide implicit abstract semantics.

  • The full instruction set of Java Bytecode is handled. Program actions that are outside the scope of our analysis, such as arithmetic operations or Boolean tests on payload data, are handled by (safe) over-approximation.

  • Specifications are given by linear-time temporal logic (LTL) formulae which support a rich set of program properties, ranging from memory safety over shape, reachability or balancedness to properties such as full traversal or preservation of the exact heap structure.

  • Except for expecting a graph grammar that specifies the data structures handled by a program, the analysis is fully automated. In particular, no program annotations are required.

  • Modular reasoning is supported in the form of contracts that summarise the effect of executing a (recursive) procedure. These contracts can be automatically derived or manually specified.

  • Valuable feedback is provided through a comprehensive report including (minimal) non-spurious counterexamples in case of property violations.

  • The tool’s functionality is made accessible through the command line as well as a graphical user and an application programming interface.

Availability. Attestor’s source code, benchmarks, and documentation are available online at https://moves-rwth.github.io/attestor.

2 The Attestor Tool

Attestor is implemented in Java and consists of about 20.000 LOC (excluding comments and tests). An architectural overview is depicted in Fig. 1. It shows the tool inputs (left), its outputs (right), the Attestor backend with its processing phases (middle), the Attestor frontend (below) as well as the API connecting back- and frontend. These elements are discussed in detail below.

Fig. 1.
figure 1

The Attestor tool

2.1 Input

As shown in Fig. 1 (left), a verification task is given by four inputs. First, the program to be analysed. Here, Java as well as Java Bytecode programs with possibly recursive procedures are supported, where the former is translated to the latter prior to the analysis. Second, the specification has to be given by a set of LTL formulae enriched with heap-specific propositions. See Sect. 3 for a representative list of exemplary specifications.

As a third input, Attestor expects the declaration of the graph grammar that guides the abstraction. In order to obtain a finite abstract state space, this grammar is supposed to cover the data structures emerging during program execution. The user may choose from a set of grammar definitions for standard data structures such as singly- and doubly-linked lists and binary trees, the manual specification in a JSON-style graph format and combinations thereof.

Fourth, additional options can be given that e.g. define the initial heap configuration(s) (in JSON-style graph format), that control the granularity of abstraction and the garbage collection behaviour, or that allow to re-use results of previous analyses in the form of procedure contracts [11, 13].

2.2 Phases

Attestor proceeds in six main phases, see Fig. 1 (middle). In the first and third phase, all inputs are parsed and preprocessed. The input program is read and transformed to Bytecode (if necessary), the input graphs (initial configuration, procedure contracts, and graph grammar), LTL formulae and further options are read.

Depending on the provided LTL formulae, additional markings are inserted into the initial heap (see [8] for details) in the second phase. They are used to track identities of objects during program execution, which is later required to validate visit and neighbourhood properties during the fifth phase.

In the next phase the actual program analysis is conducted. To this aim, Attestor first constructs the abstract state space as described in Sect. 2.3 in detail. In the fifth phase we check whether the provided LTL specification holds on the state space resulting from the preceding step. We use an off-the-shelf tableau-based LTL model checking algorithm [2].

If desired, during all phases results are forwarded to the API to make them accessible to the frontend or the user directly. We address this output in Sect. 2.4.

2.3 Abstract State Space Generation

The core module of Attestor is the abstract state space generation. It employs an abstraction approach based on hyperedge replacement grammars, whose theoretical underpinnings are described in [9] in detail. It is centred around a graph-based representation of the heap that contains concrete parts side by side with placeholders representing a set of heap fragments of a certain shape. The state space generation loop as implemented in Attestor is shown in Fig. 2.

Fig. 2.
figure 2

State space generation.

Initially it is provided with the initial program state(s), that is, the program counter corresponding to the starting statement together with the initial heap configuration(s). From these, Attestor picks a state at random and applies the abstract semantics of the next statement: First, the heap configuration is locally concretised ensuring that all heap parts required for the statement to execute are accessible. This is enabled by applying rules of the input graph grammar in forward direction, which can entail branching in the state space. The resulting configurations are then manipulated according to the concrete semantics of the statement. At this stage, Attestor automatically detects possible null pointer dereferencing operations as a byproduct of the state space generation. In a subsequent rectification step, the heap configuration is cleared from e.g. dead variables and garbage (if desired). Consequently, memory leaks are detected immediately. The rectified configuration is then abstracted with respect to the data structures specified by means of the input graph grammar. Complementary to concretisation, this is realised by applying grammar rules in backward direction, which involves a check for embeddings of right-hand sides. A particular strength of our approach is its robustness against local violations of data structures, as it simply leaves the corresponding heap parts concrete. Finalising the abstract execution step, the resulting state is labelled with the atomic propositions it satisfies. This check is efficiently implemented by means of heap automata (see [12, 15] for details). By performing a subsumption check on the state level, Attestor detects whether the newly generated state is already covered by a more abstract one that has been visited before. If not, it adds the resulting state to the state space and starts over by picking a new state. Otherwise, it checks whether further states have to be processed or whether a fixpoint in the state space generation is reached. In the latter case, this phase is terminated.

2.4 Output

As shown in Fig. 1 (right), we obtain three main outputs once the analysis is completed: the computed abstract state space, the derived procedure contracts, and the model checking results. For each LTL formula in the specification, results comprise the possible answers “formula satisfied”, “formula (definitely) not satisfied”, or “formula possibly not satisfied”. In case of the latter two, Attestor additionally produces a counterexample, i.e. an abstract trace that violates the formula. If Attestor was able to verify the non-spuriousness of this counterexample (second case), we are additionally given a concrete initial heap that is accountable for the violation and that can be used as a test case for debugging.

Besides the main outputs, Attestor provides general information about the current analysis. These include log messages such as warnings and errors, but also details about settings and runtimes of the analyses. The API provides the interface to retrieve Attestor’s outputs as JSON-formatted data.

Fig. 3.
figure 3

Screenshot of Attestor’s frontend for state space exploration. (Color figure online)

2.5 Frontend

Attestor features a graphical frontend that visualises inputs as well as results of all benchmark runs. The frontend communicates with Attestor’s backend via the API only. It especially can be used to display and navigate through the generated abstract state space and counterexample traces.

A screenshot of the frontend for state space exploration is found in Fig. 3. The left panel is an excerpt of the state space. The right panel depicts the currently selected state, where red boxes correspond to variables and constants, circles correspond to allocated objects/locations, and yellow boxes correspond to nonterminals of the employed graph grammar, respectively. Arrows between two circles represent pointers. Further information about the selected state is provided in the topmost panel. Graphs are rendered using cytoscape.js [6].

3 Evaluation

Tool Comparison. While there exists a plethora of tools for analysing pointer programs, such as, amongst others, Forester [10], Groove [7], Infer [5], Hip/Sleek [17], Korat [16], Juggrnaut [9], and Tvla [3], these tools differ in multiple dimensions:

  • Input languages range from C code (Forester, Infer, Hip/Sleek) over Java/Java Bytecode (Juggrnaut, Korat) to assembly code (Tvla) and graph programs (Groove).

  • The degree of automation differs heavily: Tools like Forester and Infer only require source code. Others such as Hip/Sleek and Juggrnaut additionally expect general data structure specifications in the form of e.g. graph grammars or predicate definitions to guide the abstraction. Moreover, Tvla requires additional program-dependent instrumentation predicates.

  • Verifiable properties typically cover memory safety. Korat is an exception, because it applies test case generation instead of verification. The tools Hip/Sleek, Tvla, Groove, and Juggrnaut are additionally capable of verifying data structure invariants, so-called shape properties. Furthermore, Hip/Sleek is able to reason about shape-numeric properties, e.g. lengths of lists, if a suitable specification is provided. While these properties are not supported by Tvla, it is possible to verify reachability properties. Moreover, Juggrnaut can reason about temporal properties such as verifying that finally every element of an input data structure has been accessed.

Benchmarks. Due to the above mentioned diversity there is no publicly available and representative set of standardised benchmarks to compare the aforementioned tools [1]. We thus evaluated Attestor on a collection of challenging, pointer intensive algorithms compiled from the literature [3, 4, 10, 14]. To assess our counterexample generation, we considered invalid specifications, e.g. that a reversed list is the same list as the input list. Furthermore, we injected faults into our examples by swapping and deleting statements.

Properties. During state space generation, memory safety (M) is checked. Moreover, we consider five classes of properties that are verified using the built-in LTL model checker:

  • The shape property (S) establishes that the heap is of a specific shape, e.g. a doubly-linked list or a balanced tree.

  • The reachability property (R) checks whether some variable is reachable from another one via specific pointer fields.

  • The visit property (V) verifies whether every element of the input is accessed by a specific variable.

  • The neighbourhood property (N) checks whether the input data structure coincides with the output data structure upon termination.

  • Finally, we consider other functional correctness properties (C), e.g. the return value is not null.

Table 1. The experimental results. All runtimes are in seconds. Verification time includes state space generation. SLL (DLL) means singly-linked (doubly-linked) list.

Setup. For performance evaluation, we conducted experiments on an Intel Core i7-7500U CPU @ 2.70 GHz with the Java virtual machine (OpenJDK version 1.8.0_151) limited to its default setting of 2 GB of RAM. All experiments were run using the Java benchmarking harness jmh. Our experimental results are shown in Table 1. Additionally, for comparison purpose we considered Java implementations of benchmarks that have been previously analysed for memory safety by Forester [10], see Table 2.

Discussion. The results show that both memory safety (M) and shape (S) are efficiently processed, with regard to both state space size and runtime. This is not surprising as these properties are directly handled by the state space generation engine. The most challenging tasks are the visit (V) and neighbourhood (N) properties as they require to track objects across program executions by means of markings. The latter have a similar impact as pointer variables: increasing their number impedes abstraction as larger parts of the heap have to be kept concrete. This effect can be observed for the Lindstrom tree traversal procedure where adding one marking (V) and three markings (N) both increase the verification effort by an order of magnitude.

Table 2. Forester benchmarks (memory safety only). Verification times are in seconds.