The Marabou Framework for Verification and Analysis of Deep Neural Networks
Abstract
Deep neural networks are revolutionizing the way complex systems are designed. Consequently, there is a pressing need for tools and techniques for network analysis and certification. To help in addressing that need, we present Marabou, a framework for verifying deep neural networks. Marabou is an SMTbased tool that can answer queries about a network’s properties by transforming these queries into constraint satisfaction problems. It can accommodate networks with different activation functions and topologies, and it performs highlevel reasoning on the network that can curtail the search space and improve performance. It also supports parallel execution to further enhance scalability. Marabou accepts multiple input formats, including protocol buffer files generated by the popular TensorFlow framework for neural networks. We describe the system architecture and main components, evaluate the technique and discuss ongoing work.
1 Introduction
Recent years have brought about a major change in the way complex systems are being developed. Instead of spending long hours handcrafting complex software, many engineers now opt to use deep neural networks (DNNs) [6, 19]. DNNs are machine learning models, created by training algorithms that generalize from a finite set of examples to previously unseen inputs. Their performance can often surpass that of manually created software as demonstrated in fields such as image classification [16], speech recognition [8], and game playing [21].
Despite their overall success, the opacity of DNNs is a cause for concern, and there is an urgent need for certification procedures that can provide rigorous guarantees about network behavior. The formal methods community has taken initial steps in this direction, by developing algorithms and tools for neural network verification [5, 9, 10, 12, 18, 20, 23, 24]. A DNN verification query consists of two parts: (i) a neural network, and (ii) a property to be checked; and its result is either a formal guarantee that the network satisfies the property, or a concrete input for which the property is violated (a counterexample). A verification query can encode the fact, e.g., that a network is robust to small adversarial perturbations in its input [22].
A neural network is comprised of neurons, organized in layers. The network is evaluated by assigning values to the neurons in the input layer, and then using these values to iteratively compute the assignments of neurons in each succeeding layer. Finally, the values of neurons in the last layer are computed, and this is the network’s output. A neuron’s assignment is determined by computing a weighted sum of the assignments of neurons from the preceding layer, and then applying to the result a nonlinear activation function, such as the Rectified Linear Unit (ReLU) function, \(\text {ReLU} {}{}(x)=\max {}(0,x)\). Thus, a network can be regarded as a set of linear constraints (the weighted sums), and a set of nonlinear constraints (the activation functions). In addition to a neural network, a verification query includes a property to be checked, which is given in the form of linear or nonlinear constraints on the network’s inputs and outputs. The verification problem thus reduces to finding an assignment of neuron values that satisfies all the constraints simultaneously, or determining that no such assignment exists.
This paper presents a new tool for DNN verification and analysis, called Marabou. The Marabou project builds upon our previous work on the Reluplex project [2, 7, 12, 13, 15, 17], which focused on applying SMTbased techniques to the verification of DNNs. Marabou follows the Reluplex spirit in that it applies an SMTbased, lazy search technique: it iteratively searches for an assignment that satisfies all given constraints, but treats the nonlinear constraints lazily in the hope that many of them will prove irrelevant to the property under consideration, and will not need to be addressed at all. In addition to search, Marabou performs deduction aimed at learning new facts about the nonlinear constraints in order to simplify them.

Native support for fully connected and convolutional DNNs with arbitrary piecewiselinear activation functions. This extends the Reluplex algorithm, which was originally designed to support only ReLU activation functions.

Builtin support for a divideandconquer solving mode, in which the solver is run with an initial (small) timeout. If the timeout is reached, the solver partitions its input query into simpler subqueries, increases the timeout value, and repeats the process on each subquery. This mode naturally lends itself to parallel execution by running subqueries on separate nodes; however, it can yield significant speedups even when used with a single node.

A complete simplexbased linear programming core that replaces the external solver (GLPK) that was previously used in Reluplex. The new simplex core was tailored for a smooth integration with the Marabou framework and eliminates much of the overhead in Reluplex due to the use of GLPK.

Multiple interfaces for feeding queries into the solver. A query’s neural network can be provided in a textual format or as a protocol buffer (protobuf) file containing a TensorFlow model; and the property can be either compiled into the solver, provided in Python, or stored in a textual format. We expect these interfaces will simplify usage of the tool for many users.

Support for networklevel reasoning and deduction. The earlier Reluplex tool performed deductions at the level of single constraints, ignoring the input network’s topology. In Marabou, we retain this functionality but also include support for reasoning based on the network topology, such as symbolic bound tightening [23]. This allows for efficient curtailment of the search space.
2 Design of Marabou
Marabou regards each neuron in the network as a variable and searches for a variable assignment that simultaneously satisfies the query’s linear constraints and nonlinear constraints. At any given point, Marabou maintains the current variable assignment, lower and upper bounds for every variable, and the set of current constraints. In each iteration, it then changes the variable assignment in order to (1) correct a violated linear constraint, or (2) correct a violated nonlinear constraint.
The Marabou verification procedure is sound and complete, i.e. the aforementioned loop eventually terminates. This can be shown via a straightforward extension of the soundness and completeness proof for Reluplex [12]. However, in order to guarantee termination, Marabou only supports activation functions that are piecewiselinear. The tool already has builtin support for the ReLU function and the Max function \(\max {}(x_1,\ldots ,x_n)\), and it is modular in the sense that additional piecewiselinear functions can be added easily.
Another important aspect of Marabou’s verification strategy is deduction—specifically, the derivation of tighter lower and upper variable bounds. The motivation is that such bounds may transform piecewiselinear constraints into linear constraints, by restricting them to one of their linear segments. To achieve this, Marabou repeatedly examines linear and nonlinear constraints, and also performs networklevel reasoning, with the goal of discovering tighter variable bounds.
Next, we describe Marabou’s main components (see also Fig. 1).
2.1 Simplex Core (Tableau and BasisFactorization Classes)
The simplex core is the part of the system responsible for making the variable assignment satisfy the linear constraints. It does so by implementing a variant of the simplex algorithm [3]. In each iteration, it changes the assignment of some variable x, and consequently the assignment of any variable y that is connected to x by a linear equation. Selecting x and determining its new assignment is performed using standard algorithms—specifically, the revised simplex method in which the various linear constraints are kept in implicit matrix form, and the steepestedge and Harris’ ratio test strategies for variable selection.
Creating an efficient simplex solver is complicated. In Reluplex, we delegated the linear constraints to an external solver, GLPK. Our motivation for implementing a new custom solver in Marabou was twofold: first, we observed in Reluplex that the repeated translation of queries into GLPK and extraction of results from GLPK was a limiting factor on performance; and second, a black box simplex solver did not afford the flexibility we needed in the context of DNN verification. For example, in a standard simplex solver, variable assignments are typically pressed against their upper or lower bounds, whereas in the context of a DNN, other assignments might be needed to satisfy the nonlinear constraints. Another example is the deduction capability, which is crucial for efficiently verifying a DNN and whose effectiveness might depend on the internal state of the simplex solver.
2.2 PiecewiseLinear Constraints (PiecewiseLinearConstraint Class)
Throughout its execution, Marabou maintains a set of piecewiselinear constraints that represent the DNN’s nonlinear functions. In iterations devoted to satisfying these constraints, Marabou looks for any constraints that are not satisfied by the current assignment. If such a constraint is found, Marabou changes the assignment in a way that makes that constraint satisfied. Alternatively, in order to guarantee eventual termination, if Marabou detects that a certain constraint is repeatedly not satisfied, it may perform a casesplit on that constraint: a process in which the piecewiselinear constraint \(\varphi \) is replaced by an equivalent disjunction of linear constraints \(c_1\vee \ldots \vee c_n\). Marabou considers these disjuncts one at a time and checks for satisfiability. If the problem is satisfiable when \(\varphi \) is replaced by some \(c_i\), then the original problem is also satisfiable; otherwise, the original problem is unsatisfiable.

satisfied(): the constraint object needs to answer whether or not it is satisfied given the current assignment. For example, for a constraint \(y=\text {ReLU} {}{}(x)\) and assignment \(x=y=3\), satisfied() would return true; whereas for assignment \(x=5, y=3\), it would return false.

getPossibleFixes(): if the constraint is not satisfied by the current assignment, this method returns possible changes to the assignment that would correct the violation. For example, for \(x=5, y=3\), the ReLU constraint from before might propose two possible changes to the assignment, \(x\leftarrow 3\) or \(y\leftarrow 0\), as either would satisfy \(y=\text {ReLU} {}{}(x)\).

getCaseSplits(): this method asks the piecewiselinear constraint \(\varphi \) to return a list of linear constraints \(c_1,\ldots ,c_n\), such that \(\varphi \) is equivalent to \(c_1\vee \ldots \vee c_n\). For example, when invoked for a constraint \(y=\max {}(x_1, x_2)\), getCaseSplits() would return the linear constraints \(c_1:(y=x_1\wedge x_1\ge x_2)\) and \(c_2:(y=x_2\wedge x_2\ge x_1)\). These constraints satisfy the requirement that the original constraint is equivalent to \(c_1\vee c_2\).

getEntailedTightenings(): as part of Marabou’s deduction of tighter variable bounds, piecewiselinear constraints are repeatedly informed of changes to the lower and upper bounds of variables they affect. Invoking getEntailedTightenings() queries the constraint for tighter variable bounds, based on current information. For example, suppose a constraint \(y=\text {ReLU} {}{}(x)\) is informed of the upper bounds \(x\le 5\) and \(y\le 7\); in this case, getEntailedTightenings() would return the tighter bound \(y\le 5\).
2.3 Constraint and NetworkLevel Reasoning (RowBoundTightener, ConstraintBoundTightener and SymbolicBoundTightener Classes)
Effective deduction of tighter variable bounds is crucial for Marabou’s performance. Deduction is performed at the constraint level, by repeatedly examining linear and piecewiselinear constraints to see if they imply tighter variable bounds; and also at the DNNlevel, by leveraging the network’s topology.
Constraintlevel bound tightening is performed by querying the piecewiselinear constraints for tighter bounds using the getEntailedTightenings() method. Similarly, linear equations can also be used to deduce tighter bounds. For example, the equation \(x=y+z\) and lower bounds \(x\ge 0\), \(y\ge 1\) and \(z\ge 1\) together imply the tighter bound \(x\ge 2\). As part of the simplexbased search, Marabou repeatedly encounters many linear equations and uses them for bound tightening.
Several recent papers have proposed verification schemes that rely on DNNlevel reasoning [5, 23]. Marabou supports this kind of reasoning as well, by storing the initial network topology and performing deduction steps that use this information as part of its iterative search. DNNlevel reasoning is seamlessly integrated into the search procedure by (1) initializing the DNNlevel reasoners with the most uptodate information discovered during the search, such as variable bounds and the state of piecewiselinear constraints; and (2) feeding any new information that is discovered back into the search procedure. Presently Marabou implements a symbolic bound tightening procedure [23]: based on network topology, upper and lower bounds for each hidden neuron are expressed as a linear combination of the input neurons. Then, if the bounds on the input neurons are sufficiently tight (e.g., as a result of past deductions), these expressions for upper and lower bounds may imply that some of the hidden neurons’ piecewiselinear activation functions are now restricted to one of their linear segments. Implementing additional DNNlevel reasoning operations is work in progress.
2.4 The Engine (Engine and SmtCore Classes)
 1.
If a piecewiselinear constraint had to be fixed more than a certain number of times, perform a case split on that constraint.
 2.
If the problem has become unsatisfiable, e.g. because for some variable a lower bound has been deduced that is greater than its upper bound, undo a previous case split (or return UNSAT if no such case split exists).
 3.
If there is a violated linear constraint, perform a simplex step.
 4.
If there is a violated piecewiselinear constraint, attempt to fix it.
 5.
Return SAT (all constraints are satisfied).
The engine also triggers deduction steps, both at the neuron level and at the network level, according to various heuristics.
2.5 The DivideandConquer Mode and Concurrency (DnC.py)
Marabou supports a divideandconquer (D&C) solving mode, in which the input region specified in the original query is partitioned into subregions. The desired property is checked on these subregions independently. The D&C mode naturally lends itself to parallel execution, by having each subquery checked on a separate node. Moreover, the D&C mode can improve Marabou’s overall performance even when running sequentially: the total time of solving the subqueries is often less than the time of solving the original query, as the smaller input regions allow for more effective deduction steps.
 1.
Pop a pair \(\langle \phi ', t'\rangle \) from Q and attempt to solve \(\phi '\) with a timeout of \(t'\).
 2.
If the problem is UNSAT and Q is empty, return UNSAT.
 3.
If the problem is UNSAT and Q is not empty, return to step 1.
 4.
If the problem is SAT, return SAT.
 5.
If a timeout occurred, split \(\phi '\) into k subqueries \(\phi _1',\ldots , \phi _k'\) by partitioning its input region. For each subquery \(\phi _i'\), push \(\langle \phi _i', m\cdot t'\rangle \) into Q.
The timeout factor m and the splitting factor k are configurable parameters. Splitting the query’s input region is performed heuristically.
2.6 Input Interfaces (AcasParser class, maraboupy Folder)

Native Marabou format: a user prepares a query using the Marabou C++ interface, compiles the query into the tool, and runs it. This format is useful for integrating Marabou into a larger framework.

Marabou executable: a user runs a Marabou executable, and passes to it commandline parameters indicating the network and property files to be checked. Currently, network files are encoded using the NNet format [11], and the properties are given in a simple textual format.

Python/TensorFlow interface: the query is passed to Marabou through Python constructs. The python interface can also handle DNNs stored as TensorFlow protobuf files.
3 Evaluation
For our evaluation we used the ACAS Xu [12], CollisionDetection [4] and TwinStream [1] families of benchmarks. Toolwise, we considered the Reluplex tool which is the most closely related to Marabou, and also ReluVal [23] and Planet [4]. The version of Marabou used for the evaluation is available online [14].
4 Conclusion
DNN analysis is an emerging field, and Marabou is a step towards a more mature, stable verification platform. Moving forward, we plan to improve Marabou in several dimensions. Part of our motivation in implementing a custom simplex solver was to obtain the needed flexibility for fusing together the solving process for linear and nonlinear constraints. Currently, this flexibility has not been leveraged much, as these pieces are solved relatively separately. We expect that by tackling both kinds of constraints simultaneously, we will be able to improve performance significantly. Other enhancements we wish to add include: additional networklevel reasoning techniques based on abstract interpretation; better heuristics for both the linear and nonlinear constraint solving engines; and additional engineering improvements, specifically within the simplex engine.
Notes
Acknowledgements
We thank Elazar Cohen, Justin Gottschlich, and Lindsey Kuper for their contributions to this project. The project was partially supported by grants from the Binational Science Foundation (2017662), the Defense Advanced Research Projects Agency (FA875018C0099), the Federal Aviation Administration, Ford Motor Company, Intel Corporation, the Israel Science Foundation (683/18), the National Science Foundation (1814369, DGE1656518), Siemens Corporation, and the Stanford CURIS program.
References
 1.Bunel, R., Turkaslan, I., Torr, P., Kohli, P., Kumar, M.: Piecewise linear neural network verification: a comparative study. Technical report (2017). arXiv:1711.00455v1
 2.Carlini, N., Katz, G., Barrett, C., Dill, D.: Provably minimallydistorted adversarial examples. Technical report (2017). arXiv:1709.10207
 3.Chvátal, V.: Linear Programming. W. H. Freeman and Company, New York (1983)zbMATHGoogle Scholar
 4.Ehlers, R.: Formal verification of piecewise linear feedforward neural networks. In: D’Souza, D., Narayan Kumar, K. (eds.) ATVA 2017. LNCS, vol. 10482, pp. 269–286. Springer, Cham (2017). https://doi.org/10.1007/9783319681672_19CrossRefGoogle Scholar
 5.Gehr, T., Mirman, M., DrachslerCohen, D., Tsankov, E., Chaudhuri, S., Vechev, M.: AI2: safety and robustness certification of neural networks with abstract interpretation. In: Proceedings of 39th IEEE Symposium on Security and Privacy (S&P) (2018)Google Scholar
 6.Goodfellow, I., Bengio, Y., Courville, A.: Deep Learning. MIT Press, Cambridge (2016)zbMATHGoogle Scholar
 7.Gopinath, D., Katz, G., Pǎsǎreanu, C., Barrett, C.: DeepSafe: a datadriven approach for checking adversarial robustness in neural networks. In: Proceedings of 16th International Symposium on on Automated Technology for Verification and Analysis (ATVA), pp. 3–19 (2018)Google Scholar
 8.Hinton, G., et al.: Deep neural networks for acoustic modeling in speech recognition: the shared views of four research groups. IEEE Signal Process. Mag. 29(6), 82–97 (2012)CrossRefGoogle Scholar
 9.Huang, X., Kwiatkowska, M., Wang, S., Wu, M.: Safety verification of deep neural networks. In: Proceedings of 29th International Conference on Computer Aided Verification (CAV), pp. 3–29 (2017)Google Scholar
 10.Hull, J., Ward, D., Zakrzewski, R.: Verification and validation of neural networks for safetycritical applications. In: Proceedings of 21st American Control Conference (ACC) (2002)Google Scholar
 11.Julian, K.: NNet Format (2018). https://github.com/sisl/NNet
 12.Katz, G., Barrett, C., Dill, D.L., Julian, K., Kochenderfer, M.J.: Reluplex: an efficient SMT solver for verifying deep neural networks. In: Majumdar, R., Kunčak, V. (eds.) CAV 2017. LNCS, vol. 10426, pp. 97–117. Springer, Cham (2017). https://doi.org/10.1007/9783319633879_5CrossRefGoogle Scholar
 13.Katz, G., Barrett, C., Dill, D., Julian, K., Kochenderfer, M.: Towards proving the adversarial robustness of deep neural networks. In: Proceedings of 1st Workshop on Formal Verification of Autonomous Vehicles (FVAV), pp. 19–26 (2017)Google Scholar
 14.Katz, G., et al.: Marabou (2019). https://github.com/guykatzz/Marabou/tree/cav_artifact
 15.Kazak, Y., Barrett, C., Katz, G., Schapira, M.: Verifying deepRLdriven systems. In: Proceedings of 1st ACM SIGCOMM Workshop on Network Meets AI & ML (NetAI) (2019)Google Scholar
 16.Krizhevsky, A., Sutskever, I., Hinton, G.: ImageNet classification with deep convolutional neural networks. In: Advances in Neural Information Processing Systems, pp. 1097–1105 (2012)Google Scholar
 17.Kuper, L., Katz, G., Gottschlich, J., Julian, K., Barrett, C., Kochenderfer, M.: Toward scalable verification for safetycritical deep networks. Technical report (2018). arXiv:1801.05950
 18.Pulina, L., Tacchella, A.: An abstractionrefinement approach to verification of artificial neural networks. In: Touili, T., Cook, B., Jackson, P. (eds.) CAV 2010. LNCS, vol. 6174, pp. 243–257. Springer, Heidelberg (2010). https://doi.org/10.1007/9783642142956_24CrossRefGoogle Scholar
 19.Riesenhuber, M., Tomaso, P.: Hierarchical models of object recognition in cortex. Nat. Neurosci. 2(11), 1019–1025 (1999)CrossRefGoogle Scholar
 20.Ruan, W., Huang, X., Kwiatkowska, M.: Reachability analysis of deep neural networks with provable guarantees. In: Proceedings of 27th International Joint Conference on Artificial Intelligence (IJCAI) (2018)Google Scholar
 21.Silver, D., et al.: Mastering the game of go with deep neural networks and tree search. Nature 529(7587), 484–489 (2016)CrossRefGoogle Scholar
 22.Szegedy, C., et al.: Intriguing properties of neural networks. Technical report (2013). arXiv:1312.6199
 23.Wang, S., Pei, K., Whitehouse, J., Yang, J., Jana, S.: Formal security analysis of neural networks using symbolic intervals. Technical report (2018). arXiv:1804.10829
 24.Xiang, W., Tran, H., Johnson, T.: Output reachable set estimation and verification for multilayer neural networks. IEEE Trans. Neural Netw. Learn. Syst. (TNNLS) 99, 1–7 (2018)Google Scholar
Copyright information
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.