1 Introduction

A set of formulas is said to be unsatisfiable if the conjunction of its members has no model. A minimal unsatisfiable set (a mus) has the additional property that every proper subset of it is satisfiable.

Given an unsatisfiable set \({\mathcal {F}}\) the task of computing a mus contained in \({\mathcal {F}}\) (a mus of \({\mathcal {F}}\)) has long been an important problem for a range of verification applications related to diagnosis and debugging, e.g., program type debugging, circuit diagnosis, production configuration (see [6]).

muses have become even more important with the increasing applications of sat based approaches in system analysis and verification. In [23] a number of ways that muses can be used in sat based bounded model checking (bmc) are presented. For example, a mus might tell the user that the property being checked did not play a role in deriving unsat, thus indicating that the system specification is unconstrained. muses also play an important role in applications that exploit unsatisfiable sets (sometimes called unsatisfiable cores). As discussed in [6] many of these applications can benefit significantly from computing muses rather than just using the default unsatisfiable core returned by the solver. Formal equivalence checking, proof-based abstraction refinement, and boolean function bi-decomposition are three important applications in which computing a mus has proved to be beneficial [6]. Belov et al. [4] present some more recent results quantifying the benefits of computing muses in the hybrid counterexample/proof-based abstraction engine gla implemented in the abc verification tool [9]. Another recent application of muses arises in the FrankenBit verifier [12], where muses are used to compute invariants [13].

With this range of applications it is not surprising that there has been an extensive amount of research into developing more effective algorithms for computing muses, e.g., [5, 6, 11, 14, 18–20] (see [6] for a more extensive list).

In this paper we continue this line of research and present new algorithms for computing muses. Our new algorithms exploit the well known hitting set duality between the unsatisfiable subsets of an unsatisfiable set \({\mathcal {F}}\) and the correction subsets of \({\mathcal {F}}\). Our algorithms work in particular with minimal subsets—the duality also holds between minimal unsatisfiable subsets and minimal correction subsets. This duality has been exploited before to compute all muses in the camus system [16]. However, in camus the first step was to compute the set of all \({\textsc {mcses}} \), AllMcses, from which all \({\textsc {muses}} \) can be extracted by finding all minimal hitting sets of AllMcses. Unfortunately in practice it is often impossible to complete the first step of computing AllMcses.

We find new ways to exploit the \({\textsc {mus}} \)/\({\textsc {mcs}} \) connection in order to compute a single \({\textsc {mus}} \) and to incrementally compute all muses. Our method does not require computing AllMcses. We show empirically that our new algorithms advance the state of the art in mus computation, and thus can potentially enhance a range of applications in formal methods that rely on computing muses.

2 Background

Let \({\mathbb {T}}\) be some background theory and \({\mathcal {F}} \) be a set of \({\mathbb {T}}\)-formulas such that the conjunction of these formulas is \({\mathbb {T}}\)-unsat, i.e., their conjunction has no \({\mathbb {T}}\)-model. In many applications it is useful to identify a smaller subset of \({\mathcal {F}} \) that is \({\mathbb {T}}\)-unsat. In practice, if the \({\mathbb {T}}\)-sat status of various subsets of \({\mathcal {F}}\) can be effectively determined, then finding a minimal subset of \({\mathcal {F}} \) that is \({\mathbb {T}}\)-unsat is often achievable.

In this paper we will always regard a set of formulas \({\mathcal {F}} \) as representing the conjunction of its members. So, e.g., \({\mathcal {F}} \) is \({\mathbb {T}}\)-unsat means \(\bigwedge _{f\in {\mathcal {F}}} f\) is \({\mathbb {T}}\)-unsat.

Definition 1

(MUS). An unsatisfiable subset U of \({\mathcal {F}}\) is a subset of \({\mathcal {F}}\) that is \({\mathbb {T}}\)-unsat. A Minimal Unsatisfiable Set (mus) of \({\mathcal {F}} \) is an unsatisfiable subset \(M\subseteq {\mathcal {F}} \) that is minimal w.r.t. set inclusion. That is, M is \({\mathbb {T}}\)-unsat and every proper subset \(S\subsetneq M\) is \({\mathbb {T}}\)-sat.

Definition 2

(MSS). A satisfiable subset of \({\mathcal {F}}\) is a subset of \({\mathcal {F}}\) that is \({\mathbb {T}}\)-sat. A Maximal Satisfiable Subset (mss) of \({\mathcal {F}} \) is a satisfiable subset \(S\subseteq {\mathcal {F}} \) that is maximal w.r.t. set inclusion. That is, S is \({\mathbb {T}}\)-sat and every proper superset \(U\supsetneq S\) with \(U \subseteq {\mathcal {F}} \) is \({\mathbb {T}}\)-unsat.

Definition 3

(MCS). A correction subset of \({\mathcal {F}}\) is a subset of \({\mathcal {F}}\) whose complement in \({\mathcal {F}}\) is \({\mathbb {T}}\)-sat. A Minimal Correction Subset (mcs) of \({\mathcal {F}} \) is a correction subset \(C\subseteq {\mathcal {F}} \) that is minimal w.r.t. set inclusion, i.e., \({\mathcal {F}} \setminus C\) is an mss of \({\mathcal {F}} \).

Definition 4

A formula \(f \in {\mathcal {F}} \) is said to be critical (or a transition formula [7]) for \({\mathcal {F}} \) when \({\mathcal {F}} \) is \({\mathbb {T}}\)-unsat and \({\mathcal {F}}-\{f\}\) is \({\mathbb {T}}\)-sat.

Intuitively, a mus is an unsatisfiable set that cannot be reduced without causing it to become satisfiable; a mss is a satisfiable set that cannot be added to without causing it to become unsatisfiable; and an mcs is a minimal set of removals from \({\mathcal {F}} \) that causes \({\mathcal {F}} \) to become satisfiable.

A critical formula for \({\mathcal {F}} \) is one whose removal from \({\mathcal {F}} \) causes \({\mathcal {F}} \) to become satisfiable. It should be noted that if f is critical for \({\mathcal {F}} \) then (a) f must be contained in every mus of \({\mathcal {F}} \) and (b) \(\{f\}\) is an mcs of \({\mathcal {F}} \). Furthermore, it can be observed that M is a mus if and only if every \(f\in M\) is critical for M. Note that a formula f that is critical for a set S is not necessarily critical for a superset \(S'\supset S\). In particular, \(S'\) might contain other muses that do not contain f.
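These definitions can be made concrete with a small brute-force sketch (illustrative only: `sat` enumerates assignments and stands in for a real sat solver, and clauses are represented as tuples of signed integers):

```python
from itertools import product

def sat(clauses):
    """Brute-force satisfiability test: return a model {var: bool} or None."""
    vs = sorted({abs(l) for c in clauses for l in c})
    for bits in product([False, True], repeat=len(vs)):
        m = dict(zip(vs, bits))
        if all(any(m[abs(l)] == (l > 0) for l in c) for c in clauses):
            return m
    return None

def criticals(F):
    """Clauses of an unsat set F whose removal makes F satisfiable."""
    return [c for c in F if sat([d for d in F if d != c]) is not None]

# x1, (~x1 v x2), ~x2 form the only mus of F; the clause x3 is irrelevant
F = [(1,), (-1, 2), (-2,), (3,)]
print(criticals(F))   # the three mus clauses; (3,) is not critical
```

Consistent with observations (a) and (b) above, each critical clause found here is a singleton mcs, and together the critical clauses form the unique mus of this F.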

Duality. There is a well known hitting set duality between muses and mcses that to the best of our knowledge was first presented formally by Reiter [22] in the context of diagnosis problems.

A hitting set H of a collection of sets \({\mathcal {C}} \) is a set that “hits” every set in \({\mathcal {C}} \) in the sense that it has a non-empty intersection with each such set: \(\forall C\in {\mathcal {C}}. H\cap C \ne \emptyset \). A hitting set H is minimal (or irreducible) if no proper subset of H is a hitting set.

Let \({\textit{AllMuses}} ({\mathcal {F}})\) (\({\textit{AllMcses}} ({\mathcal {F}})\)) be the set containing all muses (mcses) of \({\mathcal {F}} \). Then Reiter’s result can be recast to show that \(M\in {\textit{AllMuses}} ({\mathcal {F}})\) iff M is a minimal hitting set of \({\textit{AllMcses}} ({\mathcal {F}})\), and dually, \(C \in {\textit{AllMcses}} ({\mathcal {F}})\) iff C is a minimal hitting set of \({\textit{AllMuses}} ({\mathcal {F}})\). Intuitively, we can see that if a mus M fails to hit an mcs C, then \(M \subseteq {\mathcal {F}} {-}C\), i.e., M is a subset of a satisfiable set and hence cannot be unsatisfiable. Similarly, if an mcs C fails to hit a mus M then \({\mathcal {F}} {-}C \supseteq M\) is a superset of an unsatisfiable set and hence cannot be satisfiable. It is also not hard to see that the duality between mcses and muses also holds for non-minimal sets. That is, every correction subset (not necessarily minimal) hits all unsatisfiable subsets and vice versa.
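The duality can be verified exhaustively on a tiny example. The sketch below (brute force, not one of the paper's algorithms) enumerates all muses and all mcses of a four-clause set and confirms that each collection is exactly the set of minimal hitting sets of the other:

```python
from itertools import product, combinations, chain

def sat(clauses):
    """Brute-force satisfiability test: return a model {var: bool} or None."""
    vs = sorted({abs(l) for c in clauses for l in c})
    for bits in product([False, True], repeat=len(vs)):
        m = dict(zip(vs, bits))
        if all(any(m[abs(l)] == (l > 0) for l in c) for c in clauses):
            return m
    return None

def subsets(xs):
    xs = list(xs)
    return [frozenset(s) for s in
            chain.from_iterable(combinations(xs, r) for r in range(len(xs) + 1))]

def minimal(coll):
    """Keep only the subset-minimal members of a collection of sets."""
    return {S for S in coll if not any(T < S for T in coll)}

F = [(1,), (-1,), (2,), (-2,)]    # muses: {c1, c2} and {c3, c4}
all_muses = minimal({S for S in subsets(F) if sat(S) is None})
all_mcses = minimal({C for C in subsets(F)
                     if sat([c for c in F if c not in C]) is not None})

def min_hitting_sets(coll):
    return minimal({H for H in subsets(F) if all(H & C for C in coll)})

# Reiter's duality, both directions
assert all_muses == min_hitting_sets(all_mcses)
assert all_mcses == min_hitting_sets(all_muses)
```

Here the two muses are the contradictory pairs on \(x_1\) and on \(x_2\), and the four mcses pick one clause from each pair, exactly as the duality predicts.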

Although we have discussed muses and mcses in the context of a fixed set of formulas \({\mathcal {F}} \) we will also be working with subsets of \({\mathcal {F}} \). It is useful to point out that if \({\mathcal {F}} '\subseteq {\mathcal {F}} \), then \({\textit{AllMuses}} ({\mathcal {F}} ')\subseteq {\textit{AllMuses}} ({\mathcal {F}})\) and in general \({\textit{AllMuses}} ({\mathcal {F}} ') \ne {\textit{AllMuses}} ({\mathcal {F}})\). Hence, if f is critical for \({\mathcal {F}} \) it is critical for all unsatisfiable subsets of \({\mathcal {F}} \) (f is critical iff it is contained in every mus).

An mcs \(C'\) of \({\mathcal {F}} '\subset {\mathcal {F}} \) is not necessarily an mcs of \({\mathcal {F}} \); however, \(C'\) can always be extended to an mcs C of \({\mathcal {F}} \). In particular, we can add the formulas of \({\mathcal {F}} \setminus {\mathcal {F}} '\) to \({\mathcal {F}} '\) one at a time. If \(C'\) is no longer a correction subset of \({\mathcal {F}} '\cup \{f\}\), we add f to \(C'\). At each stage the augmented \(C'\) is an mcs of the augmented \({\mathcal {F}} '\), and at the end \(C'\) has been extended to an mcs of \({\mathcal {F}} \). Since we have not seen this observation previously in the literature, and its proof is illustrative of concepts needed in our algorithms, we provide a proof here.

Proposition 1

Let \(C'\in {\textit{AllMcses}} ({\mathcal {F}} ')\) and \(f\in {\mathcal {F}} \setminus {\mathcal {F}} '\). If \(C'\) is a correction subset of \({\mathcal {F}} ' \cup \{f\}\) it is an mcs of \({\mathcal {F}} ' \cup \{f\}\), and if it is not then \(C'\cup \{f\}\) is an mcs of \({\mathcal {F}} '\cup \{f\}\).

Proof

\(C'\) is a minimal correction subset of \({\mathcal {F}} '\) if and only if for every \(a\in C'\) there exists a mus \(M\in {\textit{AllMuses}} ({\mathcal {F}} ')\) such that \(M\cap C' = \{a\}\). That is, M is only hit by a, hence \(C'\) will no longer be a correction subset if we remove a. M serves as a witness that a is needed in \(C'\), and \(C'\) is minimal iff every member of \(C'\) has a witness. Since \({\textit{AllMuses}} ({\mathcal {F}} ') \subseteq {\textit{AllMuses}} ({\mathcal {F}} ' \cup \{f\})\), the witnesses for \(C'\) remain valid after adding f to \({\mathcal {F}} '\) and if \(C'\) corrects \({\mathcal {F}} '\cup \{f\}\) it must be an mcs of \({\mathcal {F}} '\cup \{f\}\). If \(C'\) does not correct \({\mathcal {F}} '\cup \{f\}\) then there are some muses in \({\textit{AllMuses}} ({\mathcal {F}} '\cup \{f\})\) that are not hit by \(C'\). But since \(C'\) hits all muses in \({\textit{AllMuses}} ({\mathcal {F}} ')\) these un-hit muses must contain f. So \(C'\cup \{f\}\) is a correction subset of \({\mathcal {F}} '\cup \{f\}\). Furthermore, any of these new muses can serve as a witness for f, and for every \(a\in C'\) there is a witness for a in \({\textit{AllMuses}} ({\mathcal {F}} ')\) which cannot contain f. Hence, these witnesses remain valid when f is added to \(C'\), and \(C'\cup \{f\}\) is an mcs of \({\mathcal {F}} '\cup \{f\}\). \(\quad \square \)
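The extension procedure underlying Proposition 1 can be sketched as follows (an illustration, not the paper's code; the brute-force `sat` helper again stands in for a solver):

```python
from itertools import product

def sat(clauses):
    """Brute-force satisfiability test: return a model {var: bool} or None."""
    vs = sorted({abs(l) for c in clauses for l in c})
    for bits in product([False, True], repeat=len(vs)):
        m = dict(zip(vs, bits))
        if all(any(m[abs(l)] == (l > 0) for l in c) for c in clauses):
            return m
    return None

def extend_mcs(C, Fprime, rest):
    """Grow an mcs C of Fprime into an mcs of Fprime + rest, one formula at a time."""
    C, Fprime = set(C), list(Fprime)
    for f in rest:
        Fprime.append(f)
        # if C no longer corrects the augmented set, add f to C (Proposition 1)
        if sat([c for c in Fprime if c not in C]) is None:
            C.add(f)
    return C

Fp   = [(1,), (-1,)]        # F' = {x1, ~x1}; C = {x1} is an mcs of F'
rest = [(2,), (-2,)]        # adding x2 and ~x2 introduces a second mus
C = extend_mcs({(1,)}, Fp, rest)   # contains (1,) and (-2,)
```

On this example, adding \(x_2\) requires no change, while adding \(\lnot x_2\) breaks correction, so it joins the mcs; the result is an mcs of the full four-clause set, as the proposition guarantees.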

Although we have given the above definitions in terms of an arbitrary theory \({\mathbb {T}} \), in the rest of this paper we will work with \(\mathbb {T}\) being ordinary propositional logic (\({ Prop }\)) and \({\mathcal {F}} \) being a set of clauses. In particular, our algorithms assume access to some basic facilities of modern sat solvers. Some of these facilities are also available in modern smt solvers, and thus some of our ideas could be lifted to theories handled by smt solvers.

3 Prior MUS Algorithms

Current state-of-the-art algorithms for computing muses have converged on versions of Algorithm 1.

[Algorithm 1: mus extraction by deletion (pseudocode figure)]

Algorithm 1 operates on a working set of clauses \(W=({ unkn } \cup { crits })\) with the clauses of unknown status, \({ unkn } \), initially equal to \({\mathcal {F}} \). In the main while loop the status of each clause in \({ unkn } \) is resolved and its size reduced until \({ unkn } =\emptyset \). At this point W consists only of a set of clauses, \({ crits } \), all of which are known to be critical for W. As observed above this implies that \(W={ crits } \) is a mus.

The input assumption is that \(W={\mathcal {F}} \) is unsat, and this condition is an invariant of the algorithm. Each iteration of the main loop selects a clause of unknown status \(c\in { unkn } \) and tests the satisfiability of \(W\setminus \{c\}\). We have that \(W\setminus \{c\}\models \lnot c\), as W has no models. Hence, we can make the sat test of \(W\setminus \{c\}\) more efficient by adding the implied \(\lnot c\) (since c is a clause, \(\lnot c\) is a set of unit clauses which are particularly advantageous for a sat solver).

If \(W\setminus \{c\}\) is sat then we know that c is critical for W (and for all subsets of W that the algorithm subsequently works with). In this case we can additionally find more critical clauses by applying the technique of recursive model rotation (rmr) [7, 24]. Note that the satisfying model \(\pi \) returned is such that \(\pi \models ({ crits } \cup { unkn })\setminus \{c\}\) and \(\pi \not \models c\), which is the condition required for rmr to work correctly. Every newly identified critical clause is removed from \({ unkn } \) and added to \({ crits } \) thus reducing the number of iterations of the main loop.

If \(W\setminus \{c\}\) is unsat then there is some mus of W that does not contain c. The algorithm then focuses on finding one of these muses by removing c from W. Note that there might also be muses of W that do contain c so the final mus found depends on the order in which clauses of \({ unkn } \) are tested. One final optimization is that we can obtain an unsat core, \(\kappa \), from the sat solver. If that core did not depend on the added \(\lnot c\) clauses then we can reduce W by setting it to \(\kappa \). In this case it must be that \({ crits } \subseteq \kappa \): all the clauses of \({ crits } \) are critical for \(W\setminus \{c\}\). Hence, to make \(W=\kappa \) we simply need to remove from \({ unkn } \) all clauses not in \(\kappa \). This optimization is called clause set refinement [20].
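Stripped of model rotation and clause set refinement, the skeleton of Algorithm 1 is a simple deletion loop. A minimal sketch (with a brute-force `sat` in place of an incremental sat solver):

```python
from itertools import product

def sat(clauses):
    """Brute-force satisfiability test: return a model {var: bool} or None."""
    vs = sorted({abs(l) for c in clauses for l in c})
    for bits in product([False, True], repeat=len(vs)):
        m = dict(zip(vs, bits))
        if all(any(m[abs(l)] == (l > 0) for l in c) for c in clauses):
            return m
    return None

def mus_deletion(F):
    """Deletion-based mus extraction: resolve each unknown clause with one sat test."""
    crits, unkn = [], list(F)
    while unkn:
        c = unkn.pop()
        if sat(crits + unkn) is not None:   # W \ {c} is sat: c is critical
            crits.append(c)
        # else some mus of W avoids c: drop c, and W stays unsat
    return crits

F = [(3,), (1,), (-1, 2), (-2,)]
M = mus_deletion(F)        # the mus {x1, ~x1 v x2, ~x2}
```

Each iteration performs exactly one sat test on \(W\setminus \{c\}\); the optimizations described above (rotation, refinement, adding \(\lnot c\) as units) reduce either the number or the cost of these tests.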

Algorithm 1 is used in state of the art mus finding algorithms like [8, 20], and these systems also add a number of other lower level optimizations as described in [20]. The main difference between these mus finding systems is that some use a modified sat solver that keeps track of the resolution proof used to derive unsat [20]—the unsatisfiable subset \(\kappa \) is extracted from that proof—while others use selector variables for the input clauses and the technique of sat solving under assumptions to obtain \(\kappa \) [10].

4 MCS Based MUS Finding

In this section we present our new algorithms for finding muses. Our algorithms are based on the duality between mcses and muses mentioned in Sect. 2. This duality has been exploited in previous work, in particular in the camus system [16]. However, in that prior work the first step was to compute all mcses of the input formula \({\mathcal {F}} \), \({\textit{AllMcses}} ({\mathcal {F}})\), after which muses were found by finding minimal hitting sets of \({\textit{AllMcses}} ({\mathcal {F}})\). This first step is very expensive, and sometimes cannot be completed since there can be an exponential number of mcses. So camus is not very effective for the task of finding a single mus. In this work we revisit this duality to arrive at algorithms that do not require an exhaustive enumeration of all mcses.

4.1 Finding a Single MUS

Algorithm 2 is our new algorithm for finding a single mus. Like Algorithm 1, it operates on the working set of clauses \(W=({ unkn } \cup { crits })\) with the clauses of unknown status, \({ unkn } \), initially equal to \({\mathcal {F}} \). In the main while loop a minimal correction subset of W is computed using Algorithm 3. Algorithm 3 works to find not just any mcs: it searches for an mcs contained entirely in \({ unkn } \). Every clause in the set \({ crits } \) is critical for W, and thus every clause in \({ crits } \) is a singleton mcs. We are not interested in finding these mcses. If there is no mcs in \({ unkn } \) it must be the case that W remains unsat even if all of \({ unkn } \) is removed from it. That is, \({ crits } \) is an unsatisfiable set all of whose members are critical—it is a mus.

If we do find an mcs, \({ CS } \), we then choose some clause c from it, add c to \({ crits } \), and remove all of \({ CS } \) from \({ unkn } \). Algorithm 3 also returns the satisfying solution \(\pi \) it found for \(W\setminus { CS } \) (verifying that \({ CS } \) is a correction subset). This solution can be used to find more criticals using rmr. Note that since \({ CS } \) is a minimal correction subset it must be the case that \(\pi \not \models a\) for every \(a\in { CS } \). Thus, \(\pi \models ({ crits } \cup { unkn })\setminus \{c\}\) and \(\pi \not \models c\), which is the condition required for rmr to work correctly. As will be described below, we have developed an extension of standard rmr, em-rmr, that can find even more new criticals.
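The core loop can be sketched in brute force (without rmr or refinement; here `find_mcs` returns a cardinality-smallest correction subset inside unkn, which is in particular an mcs, and None when crits alone is unsat, mirroring Algorithm 3's termination test):

```python
from itertools import product, combinations

def sat(clauses):
    """Brute-force satisfiability test: return a model {var: bool} or None."""
    vs = sorted({abs(l) for c in clauses for l in c})
    for bits in product([False, True], repeat=len(vs)):
        m = dict(zip(vs, bits))
        if all(any(m[abs(l)] == (l > 0) for l in c) for c in clauses):
            return m
    return None

def find_mcs(crits, unkn):
    """An mcs of crits+unkn contained in unkn, or None if crits is unsat."""
    if sat(crits) is None:
        return None
    for r in range(1, len(unkn) + 1):
        for C in combinations(unkn, r):
            if sat([c for c in crits + unkn if c not in C]) is not None:
                return list(C)   # cardinality-minimal, hence subset-minimal
    # unreachable while crits+unkn remains unsat: removing all of unkn corrects

def mcs_mus(F):
    """MCS-based mus extraction: one clause of each mcs found becomes critical."""
    crits, unkn = [], list(F)
    while True:
        CS = find_mcs(crits, unkn)
        if CS is None:                    # no mcs inside unkn: crits is a mus
            return crits
        crits.append(CS[0])               # one clause of the mcs is critical
        unkn = [d for d in unkn if d not in CS]

M = mcs_mus([(1,), (-1, 2), (-2,), (3,)])
```

On this example every mcs found is a singleton, so each iteration moves one clause to crits until crits itself becomes unsat, at which point it is returned as the mus.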

Clause set refinement can be used within this algorithm. Algorithm 3 (find-mcs) computes an unsatisfiable core whenever \(|{ CS } |\ge 1\). From this core an unsatisfiable set \(\kappa \subseteq { crits } \cup { unkn } \) can be extracted and used as in Algorithm 1 to reduce \({ unkn } \) to \({ unkn } \cap \kappa \). A simpler solution, however, is to do another sat call on the unsatisfiable set \({ crits } \cup { unkn } \) whenever \(|{ CS } |> 1\). In this case the sat solver has just refuted a closely related formula in find-mcs and can exploit its previously learned clauses to quickly refute \({ crits } \cup { unkn } \). The core it returns can then be intersected with \({ unkn } \). In our experiments, we confirmed that in the vast majority of cases the cost of this step is negligible, typically taking less than a second cumulatively.

However, in those cases where the instance contains only one mus all mcses will have size 1, and we would never get to perform clause set refinement. We address this deficiency by forcing a sat call on \({ crits } \cup { unkn } \) whenever clause set refinement has not been performed for some time. The logic of when to do the sat call and returning a reduced \({ unkn } \) set is encapsulated in the refine-clause-set subroutine.

[Algorithm 2: mcs-based mus extraction (pseudocode figure)]

Theorem 1

Suppose that the input formula \({\mathcal {F}} \) is unsat, that find-mcs correctly returns an mcs of \({ crits } \cup { unkn } \) contained in \({ unkn } \) if any exist, that em-rmr correctly returns a set of clauses critical for \({ crits } \cup { unkn } \), and that refine-clause-set correctly returns an unsatisfiable subset of \({ crits } \cup { unkn } \). Then Algorithm 2 returns a mus of its input formula \({\mathcal {F}} \).

Proof

We show that two invariants hold in the main loop of Algorithm 2: (1) \({ crits } \cup { unkn } \) is unsat and (2) every clause in \({ crits } \) is critical for \({ crits } \cup { unkn } \).

Algorithm 2 terminates when find-mcs is unable to find a correction subset in \({ unkn } \). This happens when \({ crits } \cup { unkn } \) remains unsat even after all the clauses of \({ unkn } \) are removed, i.e., when it detects that \({ crits } \) is unsat (see Algorithm 3). In this case, we know that \({ crits } \) is an unsat set of clauses and from invariant (2) all of its members are critical, i.e., it is a mus. Hence, the correctness of Algorithm 2 follows from the invariants.

Initially \({ crits } =\emptyset \) and \({ unkn } ={\mathcal {F}} \), and \({\mathcal {F}} \) is unsat by assumption. So the invariants hold at the start of the main loop. Assume that they hold up until the \(i{-}1\)’th iteration of the main loop. If in the i’th iteration we fail to find an mcs contained in \({ unkn } \), then \({ crits } \) is unsat and unchanged from the \(i{-}1\)’th iteration. So invariant (1) holds and by induction so does invariant (2).

Otherwise, let \({ CS } \) be the mcs returned by find-mcs with \({ CS } \subseteq { unkn } \). \({ CS } \) is an mcs of \(W={ crits } \cup { unkn } \), therefore there is a witness \(M\in {\textit{AllMuses}} (W)\) for every \(c\in { CS } \) with \(M\cap { CS } = \{c\}\). Algorithm 2 updates \({ crits } \) to \({ crits } \cup \{c\}\) (for some \(c\in { CS } \)) and \({ unkn } \) to \({ unkn } \setminus { CS } \). Let this updated set \({ crits } \cup { unkn } \) be \(W'= W\setminus {{ CS }} \cup \{c\}\). We have that \(M\subseteq W'\) so invariant (1) continues to hold. Furthermore, let \(M'\in {\textit{AllMuses}} (W')\) be any mus of \(W'\). Since \({\textit{AllMuses}} (W')\subseteq {\textit{AllMuses}} (W)\), \(M'\) is also a mus of W. Hence \(M'\) must be hit by the mcs \({ CS } \) and since \(W'\) only contains c from \({ CS } \) we must have \(c\in M'\). This shows that c hits all \({\textsc {muses}} \) of \(W'\), i.e., removing it from \(W'\) removes all \({\textsc {muses}} \) from \(W'\) making \(W'\) sat. That is, c is critical for \(W'={ crits } \cup { unkn } \), and invariant (2) continues to hold.

Finally since we are assuming that em-rmr is correct, the invariants are preserved after em-rmr moves some clauses from \({ unkn } \) to \({ crits } \). The call to refine-clause-set cannot affect invariant (2) and since we assume that it is correct, it preserves also invariant (1).       \(\blacksquare \)

Finding a Constrained MCS. There are two state of the art algorithms for finding mcses, CLD [17] and Relaxation Search [3]. Both can be modified to find an mcs in a particular subset of the input clauses. We tried Relaxation Search but found that an approach similar, but not identical, to CLD worked best for our purposes. The resulting Algorithm 3 finds an mcs of the union of its two input clause sets, \({ crits } \) and \({ unkn } \), that is constrained to be contained in \({ unkn } \).

[Algorithm 3: find-mcs (pseudocode figure)]

Initially a sat test is performed on \({ crits } \). If \({ crits } \) is unsat, then there is no correction subset contained in \({ unkn } \), so the algorithm returns null. Otherwise, we have a satisfying model \(\pi \) of \({ crits } \). The set of clauses falsified by any model is always a correction subset, and for \(\pi \) this correction subset, \({ CS } \), is contained in \({ unkn } \). The algorithm makes \({ CS } \) minimal by a sequence of sat calls, each one asking the sat solver to find a new model that falsifies a proper subset of clauses from the previous model. At each iteration, \({ CS } \) is updated to be the reduced set of falsified clauses. This continues until a model cannot be found or \({ CS } \) is reduced down to one clause. If a model cannot be found, this means that adding any clause of \({ CS } \) to \(({ crits } \cup { unkn })\setminus { CS } \) yields an unsatisfiable formula, i.e., \({ CS } \) is an mcs. If \({ CS } \) is reduced to a single clause, then \({ CS } \) must be an mcs, since \({ crits } \cup { unkn } \) is unsatisfiable and an invariant of the algorithm is that \({ CS } \) is always a correction subset of \({ crits } \cup { unkn } \).

The main difference between Algorithm 3 and CLD of [17] lies in the encoding of the \( atLeastOneIsTrue({ CS }) \) constraint passed to the sat solver. In CLD this constraint is encoded as one large clause that is the disjunction of all of the clauses in \({ CS } \). \(\pi \) falsifies all clauses of \({ CS } \), so it must falsify their disjunction; therefore this disjunction is not a tautology. Furthermore, when the disjunction is satisfied at least one more clause of \({ CS } \) must also be satisfied. In Algorithm 3 we instead add a selection variable to each clause of \({ CS } \). That is, each clause \(c_i\in { CS } \) is transformed into the clause \(c_i^+ = c_i\vee \lnot s_i\), where \(s_i\) is a new variable not appearing elsewhere in the formula. Making \(s_i\) true strengthens \(c_i^+\) back to \(c_i\), while making it false satisfies \(c_i^+\), effectively removing it from the CNF. With these selector variables \( atLeastOneIsTrue({ CS }) \) can be encoded as a clause containing all of the selector variables: \(\bigvee _{c_i\in { CS }} s_i\).
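The selector-variable encoding can be illustrated directly (a sketch; the selector variable numbers are arbitrary choices, and the brute-force `sat` helper stands in for a solver):

```python
from itertools import product

def sat(clauses):
    """Brute-force satisfiability test: return a model {var: bool} or None."""
    vs = sorted({abs(l) for c in clauses for l in c})
    for bits in product([False, True], repeat=len(vs)):
        m = dict(zip(vs, bits))
        if all(any(m[abs(l)] == (l > 0) for l in c) for c in clauses):
            return m
    return None

def relax(CS, first_sel):
    """Each clause c_i in CS becomes c_i v ~s_i; return relaxed clauses + selectors."""
    relaxed, sels = [], []
    for i, c in enumerate(CS):
        s = first_sel + i
        relaxed.append(tuple(c) + (-s,))   # setting s_i true restores c_i
        sels.append(s)
    return relaxed, sels

CS = [(1,), (2,)]                  # clauses falsified by the current model
relaxed, sels = relax(CS, first_sel=10)
at_least_one = tuple(sels)         # atLeastOneIsTrue(CS) as one selector clause
m = sat(relaxed + [at_least_one])
satisfied = [c for c in CS if any(m[abs(l)] == (l > 0) for l in c)]
# any model of the relaxed CNF plus the selector clause satisfies >= 1 clause of CS
```

Whichever selector the model sets true strengthens its relaxed clause back to the original, so the new model necessarily satisfies a proper superset of the previously satisfied clauses, which is exactly what the minimization loop needs.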

In addition, we found that in 90% of cases when the sat call is able to find an improving model it was able to do so without backtracking (no conflicts were found). Hence, much of the solver’s time is spent descending a single branch that changes little between iterations of Algorithm 3. This time can be significantly reduced if we backtrack only to the point where the branches diverge. This is similar to techniques used already in general sat solving for improving restarts [21] and in incremental sat solving for reducing the overhead of assumptions [1]. We found that these two simple changes had a surprisingly positive effect on efficiency.

Recursive Model Rotation (RMR). If \({\mathcal {F}} \) is unsatisfiable then it follows from the definition that a clause c is critical for \({\mathcal {F}} \) if and only if there exists a model \(\pi \) such that \(\pi \models {\mathcal {F}} \setminus \{c\}\). Hence, if in Algorithm 1 or Algorithm 2, we find for the current set of working clauses \(W=({ unkn } \cup { crits })\) a model \(\pi \) such that \(\pi \models W\setminus \{c\}\) we know that c is critical for W.

The technique of rmr [7] is to examine models that differ from \(\pi \) by only one assignment to see if we can find a model that witnesses the criticality of a clause different from c whose status is still undetermined. This is accomplished by flipping \(\pi \)’s assignments to the variables of c one by one. Each such flipped model satisfies c and can be checked to see if it falsifies only one other unknown clause. If such a clause \(c'\) is found, then \(c'\) is now known to be critical, and we can recursively flip the model that witnesses its criticality. Recursive model rotation has been found to be very effective in mus finding, eliminating many sat tests. Eager model rotation (ermr) [20] improves rmr by allowing it to falsify a critical clause, which may enable further rotations.
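Plain rmr (without the eager variant) is easy to sketch; the brute-force helpers below again stand in for solver machinery, and for simplicity all clauses of the set are treated as having unknown status except the starting critical:

```python
from itertools import product

def sat(clauses):
    """Brute-force satisfiability test: return a model {var: bool} or None."""
    vs = sorted({abs(l) for c in clauses for l in c})
    for bits in product([False, True], repeat=len(vs)):
        m = dict(zip(vs, bits))
        if all(any(m[abs(l)] == (l > 0) for l in c) for c in clauses):
            return m
    return None

def rotate(F, crits, c, pi):
    """Flip pi on each variable of c; a flipped model falsifying exactly one
    not-yet-known clause witnesses that clause's criticality, and we recurse."""
    for lit in c:
        pi2 = dict(pi)
        pi2[abs(lit)] = not pi2[abs(lit)]       # rotate the model on one variable
        falsified = [d for d in F
                     if not any(pi2[abs(l)] == (l > 0) for l in d)]
        if len(falsified) == 1 and falsified[0] not in crits:
            crits.add(falsified[0])             # pi2 witnesses a new critical
            rotate(F, crits, falsified[0], pi2)

F = [(1,), (-1, 2), (-2,)]       # a mus: every clause is critical
pi = sat([(-1, 2), (-2,)])       # model of F \ {(1,)} witnesses (1,) critical
crits = {(1,)}
rotate(F, crits, (1,), pi)       # discovers the other two criticals, no sat calls
```

Starting from the single witness model, rotation here reaches all three clauses of the mus without any further sat tests, which is why the technique eliminates so many calls in practice.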

We have found that we can effectively find more critical clauses than ermr using Algorithm 4. This algorithm first runs ermr and then uses a sat solver to find a model that witnesses the criticality of a clause of unknown status. This is done by using a standard encoding of an “at most one” constraint over the negation of the selector variables of the clauses currently in \({ unkn } \). This forces the model to satisfy all clauses of \({ crits } \cup { unkn } \) except one. (The model must falsify that remaining clause as \({ crits } \cup { unkn } \) is unsatisfiable). This sequence of sat calls can in principle find all critical clauses, but it can sometimes take too long. In practice, we put a strict time bound on the sat call, and found that within that bound we could still find a useful number of additional critical clauses. As we will show in Sect. 5, this method can sometimes hinder performance, but also allows us to solve some instances that were otherwise too hard.

[Algorithm 4: em-rmr (pseudocode figure)]

4.2 Finding All MUSES

We have also developed an algorithm for finding all muses. Our algorithm exploits the idea of using a sat formula to represent a constrained collection of sets. This idea was also used in the marco system, which enumerates all muses [15]. Specifically, if we regard each propositional variable as being a set element, then the set of variables made true by any model can be viewed as a set. The set of satisfying models then represents a collection of sets.

In marco, all muses are enumerated by maintaining a sat formula \({ ClsSets }\) which contains a variable \(s_i\) for each clause \(c_i\in {\mathcal {F}} \). Clauses are added to \({ ClsSets }\) to exclude all already found muses as well as all supersets of these muses. For example, if \(M=\{c_1, c_3, c_4\}\) is a known mus then the clause \((\lnot s_1 \vee \lnot s_3 \vee \lnot s_4)\) ensures that every satisfying model of \({ ClsSets } \) excludes at least one clause of M—this blocks M and all supersets of M from being solutions to \({ ClsSets } \). A sat test is performed on \({ ClsSets } \) which extracts a subset \({\mathcal {F}} '\) of \({\mathcal {F}} \) not containing any known mus. If \({\mathcal {F}} '\) is unsat one of its muses is extracted using Algorithm 1 and then blocked in \({ ClsSets }\), otherwise marco grows \({\mathcal {F}} '\) into an mss-mcs pair \(\langle S, {\mathcal {F}} \setminus S \rangle \) and a clause is added to \({ ClsSets } \) to block \({\mathcal {F}} \setminus S\) and all of its supersets. For example, for a correction subset \(C=\{c_1, c_3, c_4\}\) the clause \((s_1\vee s_3\vee s_4)\) is added to \({ ClsSets }\). When \({ ClsSets } \) becomes unsat, all muses have been enumerated.
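The two kinds of blocking clause can be sketched over selector variables \(s_1,\dots ,s_4\) (the clause indices here are an arbitrary illustration, and the brute-force `sat` helper stands in for a solver):

```python
from itertools import product

def sat(clauses):
    """Brute-force satisfiability test: return a model {var: bool} or None."""
    vs = sorted({abs(l) for c in clauses for l in c})
    for bits in product([False, True], repeat=len(vs)):
        m = dict(zip(vs, bits))
        if all(any(m[abs(l)] == (l > 0) for l in c) for c in clauses):
            return m
    return None

ClsSets = []

def block_mus(mus):          # exclude the mus and every superset of it
    ClsSets.append(tuple(-s for s in mus))

def block_mcs(mcs):          # force every future solution to hit this mcs
    ClsSets.append(tuple(mcs))

block_mus([1, 2])            # a known mus {c1, c2}: add (~s1 v ~s2)
block_mcs([1, 3])            # a known mcs {c1, c3}: add (s1 v s3)
m = sat(ClsSets)
Fprime = {i for i in (1, 2, 3, 4) if m.get(i, False)}
# F' contains no known mus and hits every known mcs
```

Any solution of \({ ClsSets }\) now omits at least one of \(c_1, c_2\) and includes at least one of \(c_1, c_3\); when no solution remains, every mus has been enumerated and blocked. (Unlike marco itself, this sketch makes no effort to grow \({\mathcal {F}} '\) maximally.)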

[Algorithm 5: All-MCS-MUS (pseudocode figure)]

Algorithm 5 is our new algorithm for enumerating all muses of an unsatisfiable formula \({\mathcal {F}} \). The high level structure of Algorithm 5 is similar to that of marco, but rather than treating the mus extraction procedure as a black box, it records the (not necessarily minimal) correction subsets discovered during mus extraction and uses them to accelerate the extraction of future muses. In particular, muses and mcses are blocked in the same way as in marco. Hence, at any stage the set \({\mathcal {F}} '\) obtained from a sat solution of \({ ClsSets }\) has the properties (a) \({\mathcal {F}} '\) does not contain any known muses and (b) \({\mathcal {F}} '\) hits all known correction subsets. We want \({\mathcal {F}} '\) to hit all known correction subsets as otherwise \({\mathcal {F}} '\) cannot contain a mus. When \({ ClsSets }\) becomes unsatisfiable all muses have been enumerated (and blocked).

Given a sat solution \({\mathcal {F}} '\) of \({ ClsSets }\), we extract a mus using a procedure similar to Algorithm 2. In addition, however, we mirror the removal of clauses from \({ unkn }\) by setting the corresponding variable in \({ ClsSets }\) to false. Unit propagation in \({ ClsSets }\) may then set further variables, whose corresponding clauses can be moved to \({ crits }\), because previously discovered correction sets must be hit. Once a mus is constructed, all these assignments to \({ ClsSets }\) are retracted.

One complication that arises in comparison to Algorithm 2 is that when we discover an mcs, it is only an mcs of \({ crits } \cup { unkn } \), but we can only add mcses of \({\mathcal {F}} \) to \({ ClsSets }\). Therefore, we need to extend each mcs that we discover to an mcs of \({\mathcal {F}} \). The function extend-cs does this by adding to \({ CS } \) all clauses of \({\mathcal {F}} \setminus ({ crits } \cup { unkn })\) that were violated by \(\pi \). We choose not to minimize this \({ CS }\), as it can be costly especially if \({\mathcal {F}} \) is much bigger than \({ crits } \cup { unkn } \).

An additional insight arising from the ideas of Relaxation Search [3] is that if while solving \({ ClsSets } \) we force the sat solver to always set its decision variables to true, then the set \({\mathcal {F}} '\) we obtain will be a maximal set satisfying (a) and (b) above. Thus the set of excluded clauses \({\mathcal {F}} ^c = {\mathcal {F}} \setminus {\mathcal {F}} '\) must be a minimal hitting set of the set of known muses. Each known mus in \({ ClsSets }\) forces the exclusion of at least one clause. Thus \({\mathcal {F}} ^c\) is a hitting set of the known muses. Since setting all decision variables to true causes the inclusion of clauses, all exclusions must be forced by unit propagation. This means that each excluded clause arises from a mus all of whose other clauses have already been included in \({\mathcal {F}} '\). That is, for each excluded clause c in \({\mathcal {F}} ^c\) there is some known mus M such that \(M\cap {\mathcal {F}} ^c = \{c\}\). This shows that \({\mathcal {F}} ^c\) is minimal.

Theorem 2

If its input formula \({\mathcal {F}} \) is unsat, All-MCS-MUS returns all muses of \({\mathcal {F}} \).

Proof

(Sketch). First note that all muses and all msses are solutions of \({ ClsSets }\). At each iteration, the algorithm produces either a satisfiable set, whose complement is an mcs, or an unsatisfiable set, which is reduced to a mus. Each is subsequently blocked, so neither it nor any of its supersets can be reported again. Additionally, the inner loop generates correction subsets, which it blocks in \({ ClsSets }\) without checking whether they are mcses of \({\mathcal {F}} \). If they are mcses, they will not be produced as solutions of \({ ClsSets }\). So the algorithm produces only muses and mcses before \({ ClsSets }\) becomes unsat. Moreover, it produces all muses, as this is the only way such solutions are blocked.

It remains to show that the mcses and muses it produces are correct. For mcses, this follows from the fact that the reported set is satisfiable and maximal. For muses, we only need to show that unit propagation in \({ ClsSets }\) produces critical clauses. Indeed, all correction sets that are blocked in \({ ClsSets }\) are either produced as solutions of \({ ClsSets }\) itself or as mcses of some subset of \({\mathcal {F}} \), extended to correction sets of \({\mathcal {F}} \). When such a blocking clause becomes unit, exactly one of the clauses of the corresponding correction set remains in \({ crits } \cup { unkn } \). A mus must hit all correction sets, so the sole remaining clause is critical for \({ crits } \cup { unkn } \). The correctness of the rest of the mus extraction procedure follows from that of Algorithm 2.       \(\blacksquare \)

5 Empirical Evaluation

We implemented all of our new algorithms in C++ and evaluated them against state-of-the-art algorithms for the corresponding tasks. We ran all experiments on a cluster of 48-core Opteron 6176 nodes at 2.3 GHz with 378 GB of RAM.

Discovering MCSes. The state of the art in discovering mcses is cld [17] and Relaxation Search (rs) [3]. We compared Algorithm 3 (denoted mcscl), using minisat [10] as the sat solver without preprocessing, against cld and rs on the tasks of identifying a single mcs and generating all mcses of a formula, using a set of 1343 industrial instances from sat competitions and structured maxsat instances from maxsat evaluations [17]. Figure 1 shows cactus plots comparing the three algorithms on both tasks, with a timeout of 1800 seconds.

Fig. 1.
figure 1

Number of solved instances against time for (a) generating a single mcs and (b) generating all mcses of a formula.

We first note that there is a relatively small window within which the algorithms can differentiate themselves. In the case of discovering a single mcs, more than 1200 instances are solved almost instantly by all three algorithms, while some 20 of them are out of reach for all. Regardless, mcscl is faster than the other two algorithms on both easy and hard instances, and it finds an mcs in 17 more instances than cld and 5 more instances than rs. Similarly, in the case of discovering all mcses, all three algorithms solve approximately 400 instances in less than a second, while 700 have too many mcses to enumerate. Here mcscl again outperforms the other two, finding all mcses in 15 more instances than rs and 9 more instances than cld.

Discovering a Single MUS.

For this task, we used a set of 324 instances assembled by Belov et al. [5] for the evaluation of the tool dmuser. We tested implementations of mcs-mus that used either minisat or glucose [2] as the backend sat solver, both with preprocessing enabled. We modified these solvers to bound the time spent in preprocessing to 5 % of total runtime. We evaluated mcs-mus with em-rmr or with only eager model rotation. We compared against muser [8] using minisat and using glucose, and against haifa-muc [20] (based on minisat). For all algorithms, we preprocessed the instances by trimming them using glucose and the tool DRAT-trim, a particularly effective heuristic clause set refinement method that, however, cannot prove minimality and rarely produces a minimal mus. We also compared against dmuser [5], a system that augments a “core” mus extraction algorithm with more elaborate trimming techniques. dmuser yields significant improvements to muser and haifa-muc and could potentially also improve mcs-mus; however, we have not yet integrated mcs-mus into dmuser to test this. The timeout in this experiment (including trimming) was set to 3600 seconds. Results are shown in Fig. 2.

Fig. 2.
figure 2

Number of solved instances against time for extracting a single mus.

Our first observation is that the combination of minisat and assumption-based solving is deficient, as is evident from the very poor performance of muser with minisat. Nevertheless, mcs-mus with minisat is comparable to both haifa-muc and muser with glucose (Footnote 1). We also see that em-rmr improves performance overall in this case, yielding the second best combination among core algorithms. When paired with its glucose backend, mcs-mus becomes the overall best algorithm for this task, surpassing even ITr-HM-A, the best configuration of dmuser reported in [5]. The improvement of mcs-mus when moving from minisat to glucose is not as dramatic as that of muser, but it is clearly ahead of the other core algorithms: although it solves just 6 more instances than the next closest algorithm, it does so significantly faster, and it even solves 2 more instances than dmuser. Interestingly, em-rmr improves mcs-mus with minisat but makes things worse when glucose is used.

Discovering All MUSes. Here we compare Algorithm 5 (mcs-mus-all) against marco [15], the state of the art for discovering multiple (potentially all) muses. We use only the minisat backend, as that is what marco is based on, with the additional optimization that for every unsatisfiable subset we minimize, we create a copy of the sat solver in which we can perform destructive updates. This is implicit in marco, which uses an external tool for extracting a mus.

We used the set of benchmarks from the MUS track of the 2011 sat competition (Footnote 2), without trimming, and ran both algorithms for 3600 seconds on each instance. In Fig. 3 we show scatter plots of the time taken by each algorithm to generate the first mus, the time taken to determine whether an instance has one mus or many, and the number of muses generated within the timeout. Determining whether an instance contains one mus or many requires either successfully terminating generation or generating a second mus.

Fig. 3.
figure 3

(a) Time for finding a single mus, (b) Time to differentiate between single-mus and multiple-mus instances, (c) number of muses generated, in logscale. In all cases, points above the line indicate mcs-mus-all was better.

We see that mcs-mus-all is more effective than marco both at generating the first mus and at differentiating between instances with a single mus and those with many. Indeed, it finds a mus in 20 more instances than marco and differentiates 17 more instances. However, when marco can generate several muses, it is typically more efficient at doing so, especially for very large numbers of muses. We conjecture that in these cases extracting a single mus is so efficient that the overhead of tracking the correction sets that mcs-mus-all generates outweighs their potential benefit. Thus, when the objective is to quickly generate a variety of muses on instances of moderate difficulty, mcs-mus-all is to be preferred, whereas for generating large numbers of muses on easy instances, marco is the better choice.

6 Conclusions

We have proposed a novel approach to extracting muses from unsatisfiable formulas. We exploited the well-known hitting set duality between correction sets and unsatisfiable subsets, using a greedy approach which, given an unhit mcs, can extend a set of clauses so that it is guaranteed to be a subset of a mus. We further extended this algorithm to generate all muses. These developments hinge in part on our new, very efficient mcs extraction algorithm. In all cases, we have demonstrated that the new algorithms outperform the state of the art. This is despite the fact that, in contrast to the current state of the art [20], our implementation contains little tuning and few low-level optimizations, which suggests that exploring such optimizations in future work could widen the gap.