Testing for Distinguishing Repair Candidates in Spreadsheets – the Mussco Approach
Abstract
Companies and other organizations regularly use spreadsheets as a basis for evaluation or decision making. Hence, spreadsheets have a huge economic and societal impact, and fault detection, localization, and correction in the domain of spreadsheet development and maintenance become more and more important. In this paper, we focus on supporting fault localization and correction given a spreadsheet and information about the expected cell values that contradict the computed values. In particular, we present a constraint-based approach that computes potential root causes for observed behavioral deviations and also provides possible fixes. In our approach, we compute possible fixes using spreadsheet mutation operators applied to the cells’ equations. As the number of fixes can be large, we automatically generate distinguishing test cases to eliminate those fixes that are invalid corrections. In addition, we discuss the first results of an empirical evaluation based on a publicly available spreadsheet corpus. The approach generates on average 3.1 distinguishing test cases and reports 3.2 mutants as possible fixes.
Keywords
Fault localization · Spreadsheet debugging · Distinguishing test cases · Spreadsheet mutations

1 Introduction
Spreadsheets are a flexible end-user programming environment. “End-user” programmers vastly outnumber professional ones: the US Bureau of Labor Statistics estimates that more than 55 million people used spreadsheets and databases at work on a daily basis by 2012 [14]. 95 % of all U.S. companies use spreadsheets for financial reporting [19], and 50 % of all spreadsheets are the basis for decisions.
Numerous studies have shown that existing spreadsheets contain redundancies and errors at an alarmingly high rate, e.g., [6]. This high error rate can be explained by the lack of fundamental support for testing, debugging, and structured programming in the spreadsheet world. Errors in spreadsheets may entail a serious economic impact, causing yearly losses of around 10 billion dollars [18]^{1}. This paper improves the state of the art in spreadsheet debugging by proposing an approach for correcting faults in spreadsheets.
In this paper, we use constraint-based techniques for spreadsheet debugging [3, 13]. These techniques take as input a faulty spreadsheet and a test case^{3} that reveals the fault in order to compute a set of diagnosis candidates (cells). The spreadsheet and the test case are converted into a constraint satisfaction problem (CSP). A constraint or SMT (satisfiability modulo theories) solver is used to obtain the set of diagnosis candidates. A major limitation of these approaches is that they yield many diagnosis candidates. To alleviate this problem, we propose to integrate testing in order to restrict the number of diagnosis candidates. In particular, we propose to compute possible corrections of the program (using mutation techniques) and, from these, distinguishing test cases. A test case is a distinguishing test case if and only if there is at least one output variable where the computed values of two mutated versions of a spreadsheet differ on the same input. We make two main contributions: (1) We propose Mussco (Mutation Supported Spreadsheet COrrection), an approach to fault localization in spreadsheets that relies on constraint-based reasoning to provide suggestions for possible fixes by applying spreadsheet mutation operators. Since the number of such mutants can be large, our approach automatically generates distinguishing test cases to eliminate mutants that are invalid corrections. (2) We carried out an empirical evaluation using the publicly available Integer Spreadsheet Corpus. Results show that on average 3.1 distinguishing test cases are generated and 3.2 mutants are reported as possible fixes. On average, generating mutants and distinguishing test cases requires 47.9 seconds, rendering the approach applicable as a real-time application.
2 Basic Definitions
In this paper, we rely on the spreadsheet language \(\mathcal L\) defined by Hofer et al. [11]. We refer the interested reader to that paper for more information about the syntax and semantics of the underlying spreadsheet language. For the sake of completeness, we state the most important concepts and definitions in the following paragraphs.
Every spreadsheet is a matrix of cells that are uniquely identifiable using their corresponding column and row number. The function \(\varphi \) maps the cell names from a set \( CELLS \) to their corresponding position (x, y) in the matrix where x represents the column and y the row number. The functions \(\varphi _x\) and \(\varphi _y\) return the column and row number of a cell respectively. Each cell \(c \in CELLS \) has a corresponding value \(\nu (c)\) and an expression \(\ell (c)\). The value of a cell can be either undefined \(\epsilon \), an error \(\bot \), or any number, Boolean or string value. The expression of a cell \(\ell (c)\) can either be empty or an expression written in the language \(\mathcal L\). The value of a cell c is determined by its expression. If no expression is explicitly declared for a cell, the function \(\ell \) returns \(\epsilon \) while the function \(\nu \) returns 0.
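The cell functions \(\ell \) and \(\nu \) and their defaults can be illustrated with a small sketch (a dict-based encoding assumed for this example, not the paper's implementation):

```python
# Tiny model of the cell functions from this section: l(c) returns the
# cell's expression or the undefined value epsilon, and nu(c) defaults
# to 0 when no expression is declared for the cell.
EPSILON = None  # stands in for the undefined expression/value epsilon

class Sheet:
    def __init__(self):
        self.expr = {}    # l: cell name -> expression (here: a string)
        self.value = {}   # nu: cell name -> computed value

    def l(self, c):
        return self.expr.get(c, EPSILON)

    def nu(self, c):
        return self.value.get(c, 0)   # undeclared cells evaluate to 0

s = Sheet()
s.expr["B6"] = "B2 / B3"
print(s.l("B6"), s.l("Z9"), s.nu("Z9"))  # 'B2 / B3' None 0
```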
Definition 1
(Spreadsheet). A countable set of cells \({\varPi }\subseteq CELLS \) is a spreadsheet if all cells in \({\varPi }\) have a non-empty corresponding expression or are referenced in an expression, i.e., \(\forall c \in {\varPi }: (\ell (c) \not = \epsilon ) {\,\vee \,}(\exists c' \in {\varPi }: c \in \rho (\ell (c')))\).
This definition restricts spreadsheets to be finite. For our approach, we only consider loop-free spreadsheets, i.e., spreadsheets that do not contain cycles within the computation. Therefore, we introduce the notion of data dependence between cells, and the data dependence graph.
Definition 2
(Direct dependence). Let \(c_1, c_2\) be cells of a spreadsheet \({\varPi }\). The cell \(c_2\) directly depends on cell \(c_1\) if and only if \(c_1\) is used in \(c_2\)’s corresponding expression, i.e., \(dd(c_1, c_2) \leftrightarrow (c_1 \in \rho (\ell (c_2)))\).
Definition 3
(Data dependence graph). The data dependence graph (DDG) of a spreadsheet \({\varPi }\) is a tuple (V, A) with V being a set of vertices comprising exactly one vertex \(n_c\) for each cell \(c \in {\varPi }\), and A being a set of arcs \((n_{c_1}, n_{c_2})\) for all \(n_{c_1}, n_{c_2}\) where there is a direct dependence between the corresponding cells \(c_1\) and \(c_2\) respectively, i.e., \(A = \{(n_{c_1}, n_{c_2}) \mid n_{c_1}, n_{c_2} \in V \wedge dd(c_1,c_2)\}\).
From this definition, we are able to define general dependence between cells. Two cells of a spreadsheet are dependent if and only if there exists a path between the corresponding vertices in the DDG. A spreadsheet \({\varPi }\) is feasible if and only if its DDG is acyclic. From here on, we assume that all spreadsheets we consider for debugging are feasible. Hence, we use the terms spreadsheet and feasible spreadsheet synonymously. For debugging, we have to define test cases for distinguishing faulty spreadsheets from correct spreadsheets.
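The feasibility check reduces to testing the DDG for acyclicity, which can be sketched as follows (assuming, for this example only, that a spreadsheet is given as a mapping from each cell to the set of cells referenced in its expression, i.e., \(\rho (\ell (c))\)):

```python
# Sketch of the feasibility check from Definitions 2-3: a spreadsheet is
# feasible iff its data dependence graph is acyclic. Kahn's algorithm is
# used; `refs` maps each cell name to the cells its expression references.
from collections import deque

def is_feasible(refs):
    """Return True iff the DDG induced by `refs` is acyclic."""
    # An arc (c1, c2) exists when c2's expression references c1: dd(c1, c2).
    indeg = {c: 0 for c in refs}
    succs = {c: [] for c in refs}
    for c2, used in refs.items():
        for c1 in used:
            succs[c1].append(c2)
            indeg[c2] += 1
    queue = deque(c for c, d in indeg.items() if d == 0)
    visited = 0
    while queue:
        c = queue.popleft()
        visited += 1
        for c2 in succs[c]:
            indeg[c2] -= 1
            if indeg[c2] == 0:
                queue.append(c2)
    return visited == len(refs)  # all cells ordered <=> no cycle

# The running example: B6 = B2/B3, B7 = B6*B4, B8 = B7/B5.
sheet = {"B2": set(), "B3": set(), "B4": set(), "B5": set(),
         "B6": {"B2", "B3"}, "B7": {"B6", "B4"}, "B8": {"B7", "B5"}}
print(is_feasible(sheet))                          # acyclic
print(is_feasible({"A1": {"A2"}, "A2": {"A1"}}))   # mutual reference: cycle
```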
Definition 4
(Input, output, and intermediate cells). All cells of a spreadsheet whose expressions reference no other cells are called input cells, i.e., \(inputs({\varPi }) = \{c \in {\varPi } \mid \rho (\ell (c)) = \emptyset \}\). All cells that are not referenced in any other cell’s expression are called output cells. All formula cells of a spreadsheet that serve neither as input nor as output are called intermediate cells. With the definition of inputs and outputs, we can now define test cases.
Definition 5
(Test case). A test case T for a spreadsheet \({\varPi }\) is a tuple (I, O) where I is a set of pairs (c, v) specifying the values for all \(c \in inputs({\varPi })\) and O is a set of pairs (c, e) specifying the expected values for some output cells. T is a failing test case for spreadsheet \({\varPi }\) if there exists at least one cell c where the expected value e differs from the computed value \(\nu (c)\) when using I on \({\varPi }\).
We say that a test case is a passing test case if all computed values are equivalent to the expected values.
Definition 6
(Spreadsheet debugging problem). A spreadsheet \({\varPi }\) and a failing test case T form a spreadsheet debugging problem.
Example 1
The test case T with \(I =\{(\mathtt{B2},120), (\mathtt{B3},60),\) \((\mathtt{B4},72), (\mathtt{B5},2)\}\) and \( O = \{(\mathtt{B8},2160)\}\) is a failing test case for the Cardiogenic shock estimator spreadsheet. This test case together with the spreadsheet forms a debugging problem.
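The failing-test-case check of Definition 5 can be sketched on this example (the formulas of the faulty Cardiogenic shock estimator are hard-coded here, assuming the expressions B6 = B2/B3, B7 = B6*B4, B8 = B7/B5 from Fig. 1):

```python
# Minimal sketch of Definition 5: a test case (I, O) fails if some
# expected output value differs from the value computed from I.
def evaluate(inputs):
    v = dict(inputs)
    v["B6"] = v["B2"] / v["B3"]   # faulty formula (should be B2 - B3)
    v["B7"] = v["B6"] * v["B4"]
    v["B8"] = v["B7"] / v["B5"]
    return v

def is_failing(inputs, expected):
    """True iff at least one output cell differs from its expected value."""
    v = evaluate(inputs)
    return any(v[c] != e for c, e in expected.items())

I = {"B2": 120, "B3": 60, "B4": 72, "B5": 2}
O = {"B8": 2160}
print(is_failing(I, O))  # True: the faulty sheet computes B8 = 72, not 2160
```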
A solution of a spreadsheet debugging problem \(({\varPi },T)\) is a set of cells that explain the faulty behavior. In particular, we say that an explanation \({\varPi }^E\) is itself a spreadsheet comprising the same cells as \({\varPi }\) but different cell expressions that make the test case T a passing test case for \({\varPi }^E\).
Example 2
A spreadsheet \({\varPi }_1\) where the expression of cell B6 is changed to B2 - B3 is obviously an explanation that makes the test case T a passing one. However, a spreadsheet \({\varPi }_2\) where we change the expression of B7 to 30 * B6 * B4 is an explanation as well.
3 Constraint Satisfaction Problem
Example 3
The constraint representation of our example from Fig. 1 is: \(B2==120\), \(B3==60\), \(B4==72\), \(B5==2\), \(ab_{B6} {\,\vee \,}B6 = B2 / B3\), \(ab_{B7} {\,\vee \,}B7 = B6* B4\), \(ab_{B8} {\,\vee \,}B8 = B7 / B5\), \(B8==2160\).
Since spreadsheets must be finite, the Convert algorithm terminates. The computational complexity of the algorithm is \(O(| CELLS | \cdot L)\) where L is the maximum length of an expression. For computing diagnoses, let SD be the obtained constraint representation for a spreadsheet \({\varPi }\). A diagnosis \(\Delta \) is a subset of the cells contained in \({\varPi }\) such that \(\text {SD} \cup \{\lnot ab_c \mid c \in {\varPi }\setminus \Delta \} \cup \{ ab_c \mid c \in \Delta \}\) is satisfiable. We use an SMT solver for computing solutions for a given CSP. The theoretical background of using SMT solvers for CSPs is explained by Liffiton and Sakallah [15, 16].
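The diagnosis definition can be illustrated without an SMT solver by a brute-force sketch that searches small integer values for the cells assumed abnormal (a toy stand-in for the solver, restricted to the running example's three formulas; the value domain is an assumption of this example):

```python
# A set Delta of cells is a diagnosis if SD together with ab_c for
# c in Delta (and not-ab_c otherwise) is satisfiable. Here "abnormal"
# cells become free variables searched over a small integer domain,
# while normal cells keep their formula from the faulty sheet.
from itertools import product

def is_diagnosis(delta, I, expected, domain=range(0, 5001)):
    free = sorted(delta)
    for vals in product(domain, repeat=len(free)):
        v = dict(I)
        v.update(zip(free, vals))        # guessed values for abnormal cells
        if "B6" not in delta:
            v["B6"] = v["B2"] // v["B3"]
        if "B7" not in delta:
            v["B7"] = v["B6"] * v["B4"]
        if "B8" not in delta:
            v["B8"] = v["B7"] // v["B5"]
        if all(v[c] == e for c, e in expected.items()):
            return True                  # satisfiable: Delta is a diagnosis
    return False

I = {"B2": 120, "B3": 60, "B4": 72, "B5": 2}
O = {"B8": 2160}
print(is_diagnosis({"B6"}, I, O))  # True: B6 = 60 repairs the output
print(is_diagnosis(set(), I, O))   # False: the unchanged sheet fails
```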
4 Mutation Creation
With the previously described fault localization technique, the user only receives information about which cells have to be changed, but not about how they have to be changed. We are confident that information about how cells have to be changed is important for the user. Therefore, we propose an approach that automatically creates versions of the spreadsheet, i.e., mutants, that satisfy the given test case.
Weimer et al. [23] introduced genetic programming for repairing C programs. Similar to them, we make assumptions about how to restrict the search space. For example, we perform mutations on the cone for a given cell^{4}, while Weimer et al. perform mutations on the weighted path. In addition, Weimer et al. assume that the programmer has written the correct statement somewhere else in the program. We assume that when a spreadsheet programmer referenced the wrong cell, the correct cell is in the surrounding of the referenced cell. However, we differ from their genetic programming approach in that we do not use crossover and randomness for selecting mutations. We use the following mutation operators:
- replace constant with reference or other constant
- replace reference with constant or other reference
- replace arithmetical (relational) operator with other arithmetical (relational) operator
- replace function with other functions of the same arity
- resize areas
We are aware that these mutation operators cannot correct all faulty spreadsheets. In particular, the creation of completely new formulas is left for future work. In order to keep the search space manageable, we impose the following restrictions:
- Mutations are only indicated for cells that are contained in the cone of any erroneous output cell.
- When replacing references with constants, we do not immediately compute the concrete constant. Instead, we use the information that there exists a constant that could eliminate the observed misbehavior. Only when we present a mutant to the user do we compute a concrete value for that constant. The reason for this delayed computation is that there often exist many constants satisfying the primary test case. During the distinguishing test case creation process, we gain additional information that helps to reduce the number of constants.
- When changing references or resizing areas, we make use of the following assumption: if the user made a mistake when indicating the reference or area, the intended reference(s) might be in the surrounding of the originally indicated reference(s). We define the surrounding of a cell c as follows:$$\begin{aligned} \textsc {Surround}(c) {\,\equiv _{{def}}\,}\left\{ c_1 \in CELLS \,\middle|\, \begin{array}{l} \varphi _x(c)-2 \le \varphi _x(c_1) \le \varphi _x(c)+2 {\,\wedge \,}\\ \varphi _y(c)-2 \le \varphi _y(c_1) \le \varphi _y(c)+2 \\ \end{array} \right\} . \end{aligned}$$We model into our CSP that the reference to the cell is either correct or that it should be replaced by one of the cells in the surrounding. In the case of an area, we define the surrounding of the area as follows:$$\begin{aligned} \textsc {Surround}(c_1:c_2) {\,\equiv _{{def}}\,}\left\{ c_3 \in CELLS \,\middle|\, \begin{array}{l} \varphi _x(c_1)-2 \le \varphi _x(c_3) \le \varphi _x(c_2)+2 {\,\wedge \,}\\ \varphi _y(c_1)-2 \le \varphi _y(c_3) \le \varphi _y(c_2)+2 \\ \end{array} \right\} . \end{aligned}$$For areas, we allow the selection/deselection of any cell in the surrounding. This allows both shrinking and enlargement of areas, as well as non-continuous areas.
- We allow only one mutation per cell.

These restrictions do not allow finding suitable mutants for all given faulty spreadsheets. However, they allow the approach to be used in practice.
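The Surround neighborhoods (±2 columns and rows) translate directly into code. In the following sketch, the `pos` mapping plays the role of \(\varphi \) and is an assumption of this example:

```python
# Sketch of Surround(c) and Surround(c1:c2): all cells whose column and
# row lie within 2 of the reference cell (resp. of the area's corners).
def surround(c, pos, cells):
    """Surround(c): cells in the 5x5 window centred on c."""
    x, y = pos[c]
    return {c1 for c1 in cells
            if x - 2 <= pos[c1][0] <= x + 2 and y - 2 <= pos[c1][1] <= y + 2}

def surround_area(c1, c2, pos, cells):
    """Surround(c1:c2): the area's bounding box enlarged by 2 on each side."""
    (x1, y1), (x2, y2) = pos[c1], pos[c2]
    return {c3 for c3 in cells
            if x1 - 2 <= pos[c3][0] <= x2 + 2 and y1 - 2 <= pos[c3][1] <= y2 + 2}

# Toy 3x3 grid A1..C3; pos maps a cell name to (column, row).
pos = {f"{col}{row}": (i + 1, row)
       for i, col in enumerate("ABC") for row in (1, 2, 3)}
print(sorted(surround("A1", pos, pos)))  # every cell within 2 cols/rows of A1
```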
Example 4
The extended constraint representation for the cell B6 of our Cardiogenic shock estimator from Fig. 1 changes from \(ab_{B6} {\,\vee \,}B6 = B2 / B3\) to: \((ab_{B6} {\,\wedge \,}(B6 = B2 + B3 {\,\vee \,}B6 = B2 - B3 {\,\vee \,}B6 = B1 / B3 {\,\vee \,}B6 = 5 / B3 {\,\vee \,}\dots ) ){\,\vee \,}B6 = B2 / B3\).
5 Computing Distinguishing Test Cases
Usually, there exists more than one possible correction. In practice, a large number of repair suggestions overwhelms the user. Consequently, there is a strong need for distinguishing such explanations. One way to distinguish explanations is to use distinguishing test cases. Nica et al. [17] define a distinguishing test case for two variants of a program as input values that lead to the computation of different output values for the two variants. Translating this definition to the spreadsheet domain, we have to search for input values that lead to different output values for the different explanations. The user (or another oracle) has to clarify which output values are correct.
Example 5
The following new input values form a distinguishing test case for the variants \({\varPi }_1\) and \({\varPi }_2\) of our running example: \(\ell (\mathtt{B2}) = \mathtt{30}\), \(\ell (\mathtt{B3}) = \mathtt{30}\), \(\ell (\mathtt{B4}) = \mathtt{30}\), \(\ell (\mathtt{B5}) = \mathtt{1}\). For these input values, \({\varPi }_1\) computes a value 0 for cell B8, where \({\varPi }_2\) would return 900.
Algorithm GetDistTestCase (Fig. 4) creates distinguishing test cases. This algorithm takes as input a spreadsheet and two mutated versions of that spreadsheet. The functions getInputCells and getOutputCells return the set of input and output cells for a given spreadsheet (Lines 1 and 2). In Lines 3 and 4, the mutants \(m_1\) and \(m_2\) are converted into their constraint representations. When creating a distinguishing test case, we have to exclude the input cells from the spreadsheet. Therefore, we only hand over the spreadsheet without the input cells to the function Convert. This function slightly differs from the Convert function from Fig. 2, because it takes two additional parameters: (1) the particular mutant in use and (2) a constant that acts as postfix for variables. This postfix is necessary to distinguish the constraint representation of \(m_1\) from that of \(m_2\): Each variable in the constraint system for mutant \(m_1\) (\(m_2\)) gets the postfix “_1” (“_2”). In Line 5, a constraint is created that ensures that the input of \(m_1\) is equal to the input of \(m_2\). In Line 6, a constraint is created that ensures that at least one output cell of \(m_1\) has a different value than the same output cell in \(m_2\). The function GetSolution calls the solver with these constraints (Line 8). This function either returns a distinguishing test case, UNSAT (in the case of equivalent mutants), or UNKNOWN (in the case of undecidability).
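The paper encodes this search as an SMT problem; for intuition, the same search can be sketched by brute force over a small input domain, using the two explanations \({\varPi }_1\) and \({\varPi }_2\) from Example 2 (the domain bound is an assumption of this sketch):

```python
# Brute-force sketch of GetDistTestCase: find inputs on which two mutants
# compute different outputs, or report None (the UNSAT/equivalent case).
from itertools import product

def mutant1(b2, b3, b4, b5):          # Pi_1: B6 = B2 - B3
    b6 = b2 - b3
    return (b6 * b4) / b5             # B8

def mutant2(b2, b3, b4, b5):          # Pi_2: B7 = 30 * B6 * B4
    b6 = b2 / b3
    return (30 * b6 * b4) / b5        # B8

def distinguishing_test_case(m1, m2, domain):
    """Inputs on which the mutants' outputs differ, or None if none found."""
    for b2, b3, b4, b5 in product(domain, repeat=4):
        if b3 == 0 or b5 == 0:
            continue  # skip division by zero, as a solver constraint would
        if m1(b2, b3, b4, b5) != m2(b2, b3, b4, b5):
            return {"B2": b2, "B3": b3, "B4": b4, "B5": b5}
    return None

tc = distinguishing_test_case(mutant1, mutant2, range(1, 31))
print(tc)  # some input assignment on which Pi_1 and Pi_2 disagree
```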
The worst-case time complexity of the algorithm from Fig. 3 is exponential in the number of cells (\(O\left( 2^{| CELLS |}\right) \)). In practice, only solutions up to a certain size, i.e., single or double fault solutions, are relevant. The algorithm terminates: the outer while-loop (Line 2) is bounded by the size of the spreadsheet. The while-loop in Line 6 is bounded since only a limited number of mutants can be created and we do not allow mutants to be reported twice (Line 9). The innermost loop (Line 10) is bounded since the number of mutants in M has to be greater than or equal to two and the selected pair must not already have been proven equivalent or undecidable. In each iteration of this loop, either a new pair is added to the equivalent or undecidable set (Lines 14 and 17) or the set M shrinks (Line 23). M must shrink because the return set of the function Filter (Line 22) contains at least one element, since the mutants \(m_1\) and \(m_2\) must compute different output values for the given test case.
6 Empirical Evaluation
We implemented a prototype in Java that uses Z3 [7] as solver. This prototype supports the conversion of spreadsheets with basic functionality (arithmetic and relational operators, the functions ‘IF’, ‘SUM’, ‘AVERAGE’, ‘MIN’, and ‘MAX’) into Z3 formula clauses.
For the evaluation, we used the publicly available Integer Spreadsheet Corpus [5]. This corpus comprises 33 different spreadsheets (12 artificially created and 21 real-life spreadsheets) and 229 mutants of these 33 basic spreadsheets. We excluded some spreadsheets from our evaluation because Mussco was not able to generate the mutation required to correct the observed misbehavior. There are two reasons for this: (1) the correction requires more than one mutation within a single cell, which is currently not supported by our approach; (2) the required mutation operator is not implemented in Mussco. In the following empirical evaluation, we only consider the 73 spreadsheets for which Mussco was able to compute the mutation required to correct the fault.
Structure and complexity of the evaluated spreadsheets (In, Out, Form.: number of input, output, and formula cells; \(\eta _1\), \(\eta _2\), \(N_1\), \(N_2\), difficulty: Halstead complexity metrics [9])

| Name | In | Out | Form. | \(\eta _1\) | \(\eta _2\) | \(N_1\) | \(N_2\) | Difficulty |
|---|---|---|---|---|---|---|---|---|
| amortization | 16 | 1 | 16 | 5 | 33 | 31 | 67 | 5.4 |
| arithmetics00 | 10 | 1 | 8 | 1 | 23 | 11 | 29 | 1.3 |
| arithmetics01 | 9 | 1 | 11 | 2 | 23 | 14 | 34 | 2.4 |
| arithmetics02 | 13 | 1 | 16 | 1 | 36 | 21 | 50 | 1.2 |
| arithmetics03 | 19 | 1 | 35 | 1 | 64 | 45 | 99 | 1.1 |
| arithmetics04 | 23 | 2 | 24 | 1 | 59 | 51 | 98 | 1.0 |
| austrian_league | 91 | 10 | 32 | 3 | 103 | 96 | 267 | 4.2 |
| bank_account | 45 | 13 | 27 | 7 | 76 | 103 | 187 | 6.4 |
| birthdays | 39 | 3 | 39 | 7 | 86 | 78 | 189 | 8.5 |
| cake | 101 | 1 | 69 | 3 | 155 | 69 | 238 | 5.2 |
| comp_shopping | 37 | 4 | 36 | 6 | 64 | 151 | 288 | 5.7 |
| conditionals01 | 9 | 1 | 11 | 5 | 25 | 34 | 65 | 4.8 |
| dice_rolling | 31 | 3 | 21 | 4 | 40 | 99 | 190 | 3.8 |
| fibonacci | 25 | 1 | 46 | 1 | 68 | 16 | 87 | 2.7 |
| matrix | 51 | 1 | 13 | 3 | 23 | 17 | 67 | 5.9 |
| oscars2012 | 60 | 2 | 22 | 3 | 76 | 24 | 104 | 6.5 |
| prom_calculator | 46 | 1 | 14 | 2 | 63 | 14 | 73 | 5.2 |
| shares | 43 | 12 | 39 | 4 | 69 | 37 | 118 | 6.4 |
| shop_bedroom1 | 67 | 2 | 32 | 2 | 78 | 32 | 129 | 4.0 |
| shop_bedroom2 | 70 | 4 | 64 | 4 | 109 | 148 | 338 | 4.6 |
| training | 34 | 3 | 53 | 4 | 93 | 99 | 223 | 4.5 |
| weather | 70 | 5 | 41 | 6 | 131 | 89 | 231 | 7.8 |
| wimbledon2012 | 90 | 1 | 49 | 4 | 135 | 280 | 538 | 3.8 |
| Average | 43.4 | 3.2 | 31.2 | 3.4 | 71.0 | 67.8 | 161.3 | 4.5 |
In order to investigate a larger number of spreadsheets, we decided to simulate the user interactions. Therefore, we use the original correct spreadsheets as oracles to determine the output values for the generated distinguishing test cases.
Figure 5(a) shows the number of correction suggestions returned to the user. For 49 spreadsheets, only the correct mutation is returned. On average, 3.2 mutants are reported to the user. For one faulty spreadsheet containing two faulty cells, Mussco determines 27 correction suggestions. Moreover, applying the algorithm to a spreadsheet with three faults results in 94 correction suggestions. The evaluation shows that in the case of double or triple faults, Mussco finds a larger number of equivalent solutions.
Figure 5(b) illustrates the number of generated distinguishing test cases. For 27 spreadsheets, a single distinguishing test case suffices. For 26 spreadsheets, two distinguishing test cases are necessary. For one spreadsheet, 29 distinguishing test cases have to be generated. This spreadsheet contains a double fault; therefore, Mussco creates many mutants that have to be killed by the distinguishing test cases. On average, 3.1 distinguishing test cases are required.
We create a distinguishing test case as soon as we have two mutants available. Another possibility is to immediately compute all possible mutants of a particular size and generate the test cases afterwards. Does the implemented method perform better with respect to runtime? We suppose that adding more test cases to the constraint system decreases the number of mutants that are created and therefore decreases the total computation time. To verify this assumption, we compare the two methods with respect to the number of generated mutants and the total computation time in Figs. 6 and 7. Variant 1 denotes the version where we first compute all possible mutants. Variant 2 denotes the version described in Algorithm Mussco (Fig. 3). For six spreadsheets, Variant 1 results in a timeout. On average, Variant 1 creates 17.2 mutants while Variant 2 creates 5.2 mutants (comparing only those spreadsheets without timeouts). However, when comparing the computation time (see Fig. 7), the two variants differ only slightly (except for the six spreadsheets yielding a timeout when using Variant 1). It turns out that decreasing the number of computed mutants through more test cases increases the computation time per mutant. Nevertheless, we favor Variant 2 over Variant 1 since the user receives a first response earlier.
7 Related Work
Our approach is based on model-based diagnosis [20], namely its application to (semi-)automatic debugging. It uses a constraint representation and a constraint solver to pinpoint software failures. Jannach and Engler [13] presented a model-based approach that uses an extended hitting-set algorithm and user-specified or historical test cases and assertions to calculate possible error causes in spreadsheets.
GoalDebug [1] is a spreadsheet debugger for end users. This approach generates a list of change suggestions for formulas that would result in a user-specified output. GoalDebug relies on a set of predefined change inference rules. Hofer and Wotawa [12] also proposed an approach for generating repair candidates via genetic programming. In contrast to these approaches, we encode the mutation creation into a constraint satisfaction problem. In addition, we generate distinguishing test cases to keep the number of possible fixes small.
Ruthruff et al. [22] and Hofer et al. [10] propose to use spectrum-based fault localization for spreadsheet debugging. In contrast to Mussco, these approaches only identify the locations of faults instead of giving repair suggestions.
Spreadsheet testing is closely related to debugging. In the WYSIWYT system, users indicate correct/incorrect output values by placing a correct/faulty token in the cell [21]. The spreadsheet analysis tools of Abraham and Erwig [2] and Ahmad et al. [4] reason about the units of cells to find inconsistencies in formulas. The tools differ in the rules they employ and in the degree to which they require users to provide additional input. Ahmad’s tool requires users to annotate the spreadsheet cells with additional information. UCheck [2] performs unit analysis fully automatically by exploiting techniques for automated header inference.
8 Conclusions
Our spreadsheet debugging approach Mussco maps a spreadsheet into a set of constraints for computing potential diagnosis candidates. The approach makes use of mutations, i.e., small changes of formulas used in the spreadsheets, to create diagnosis candidates. These diagnosis candidates are further refined by generating distinguishing test cases.
Besides the theoretical foundations and the algorithms, we also discussed the results of an empirical evaluation, which show that distinguishing test cases improve the diagnosis of spreadsheets substantially. In particular, on average 3.1 distinguishing test cases are generated and 3.2 mutants are reported as possible fixes. On average, the generation of the mutants and distinguishing test cases requires 47.9 seconds in total, rendering the approach applicable as a real-time application. In future work, we will extend the toolset (i) by supporting more spreadsheet functionality, and (ii) by integrating it into a spreadsheet framework.
Footnotes
 1.
 2.
Cardiogenic shock occurs when the heart has been damaged so much that it is unable to supply enough blood to the organs.
 3.
A test case specifies values for the input cells as well as the expected values for the output cells.
 4.
A cone for a cell c is recursively defined as the union of all cones of the cells which are referenced in c and the cell c itself.
Notes
Acknowledgement
The work described in this paper has been funded by the Austrian Science Fund (FWF) project DEbugging Of Spreadsheet programs (DEOS) under contract number I2144 and the Deutsche Forschungsgemeinschaft (DFG) under contract number JA 2095/41.
References
1. Abraham, R., Erwig, M.: GoalDebug: a spreadsheet debugger for end users. In: Proceedings of the 29th International Conference on Software Engineering (ICSE 2007), pp. 251–260. IEEE Computer Society, Washington, DC, USA (2007)
2. Abraham, R., Erwig, M.: UCheck: a spreadsheet type checker for end users. J. Vis. Lang. Comput. 18, 71–95 (2007)
3. Abreu, R., Hofer, B., Perez, A., Wotawa, F.: Using constraints to diagnose faulty spreadsheets. Softw. Qual. J. 23(2), 297–322 (2015)
4. Ahmad, Y., Antoniu, T., Goldwater, S., Krishnamurthi, S.: A type system for statically detecting spreadsheet errors. In: 18th IEEE International Conference on Automated Software Engineering (ASE 2003), pp. 174–183. IEEE Computer Society (2003)
5. Ausserlechner, S., Fruhmann, S., Wieser, W., Hofer, B., Spork, R., Mühlbacher, C., Wotawa, F.: The right choice matters! SMT solving substantially improves model-based debugging of spreadsheets. In: 13th International Conference on Quality Software (QSIC 2013), pp. 139–148 (2013)
6. Chadwick, D., Knight, B., Rajalingham, K.: Quality control in spreadsheets: a visual approach using color codings to reduce errors in formulae. Softw. Qual. Control 9(2), 133–143 (2001)
7. de Moura, L., Bjørner, N.S.: Z3: an efficient SMT solver. In: Ramakrishnan, C.R., Rehof, J. (eds.) TACAS 2008. LNCS, vol. 4963, pp. 337–340. Springer, Heidelberg (2008)
8. Dechter, R.: Constraint Processing. Morgan Kaufmann, San Mateo (2003)
9. Halstead, M.H.: Elements of Software Science (Operating and Programming Systems Series). Elsevier Science Inc., New York (1977)
10. Hofer, B., Perez, A., Abreu, R., Wotawa, F.: On the empirical evaluation of similarity coefficients for spreadsheets fault localization. Autom. Softw. Eng. 22(1), 47–74 (2015)
11. Hofer, B., Riboira, A., Wotawa, F., Abreu, R., Getzner, E.: On the empirical evaluation of fault localization techniques for spreadsheets. In: Cortellessa, V., Varró, D. (eds.) FASE 2013 (ETAPS 2013). LNCS, vol. 7793, pp. 68–82. Springer, Heidelberg (2013)
12. Hofer, B., Wotawa, F.: Mutation-based spreadsheet debugging. In: International Workshop on Program Debugging (IWPD), Supplemental Proceedings ISSRE 2013, pp. 132–137 (2013)
13. Jannach, D., Engler, U.: Toward model-based debugging of spreadsheet programs. In: Proceedings of the 9th Joint Conference on Knowledge-Based Software Engineering (JCKBSE 2010), pp. 252–264. Kaunas, Lithuania (2010)
14. Ko, A.J., Abraham, R., Beckwith, L., Blackwell, A., Burnett, M., Erwig, M., Scaffidi, C., Lawrance, J., Lieberman, H., Myers, B., Rosson, M.B., Rothermel, G., Shaw, M., Wiedenbeck, S.: The state of the art in end-user software engineering. ACM Comput. Surv. 43(3), 21:1–21:44 (2011)
15. Liffiton, M.H., Sakallah, K.A.: Algorithms for computing minimal unsatisfiable subsets of constraints. J. Autom. Reasoning 40(1), 1–33 (2008)
16. Liffiton, M.H., Sakallah, K.A.: Generalizing core-guided Max-SAT. In: Kullmann, O. (ed.) SAT 2009. LNCS, vol. 5584, pp. 481–494. Springer, Heidelberg (2009)
17. Nica, M., Nica, S., Wotawa, F.: On the use of mutations and testing for debugging. Softw. Pract. Exper. (2012). http://dx.doi.org/10.1002/spe.1142
18. Panko, R.R.: Applying code inspection to spreadsheet testing. J. Manag. Inf. Syst. 16(2), 159–176 (1999)
19. Panko, R.R., Port, D.: End user computing: the dark matter (and dark energy) of corporate IT. In: Proceedings of the 45th Hawaii International Conference on System Sciences (HICSS-45 2012), pp. 4603–4612 (2012)
20. Reiter, R.: A theory of diagnosis from first principles. Artif. Intell. 32(1), 57–95 (1987)
21. Rothermel, K.J., Cook, C.R., Burnett, M.M., Schonfeld, J., Green, T.R.G., Rothermel, G.: WYSIWYT testing in the spreadsheet paradigm: an empirical evaluation. In: Proceedings of the 22nd International Conference on Software Engineering (ICSE 2000), pp. 230–239. ACM, New York (2000)
22. Ruthruff, J., Creswick, E., Burnett, M., Cook, C., Prabhakararao, S., Fisher II, M., Main, M.: End-user software visualizations for fault localization. In: Proceedings of the 2003 ACM Symposium on Software Visualization (SoftVis 2003), pp. 123–132. ACM, New York (2003)
23. Weimer, W., Nguyen, T., Le Goues, C., Forrest, S.: Automatically finding patches using genetic programming. In: Proceedings of the 31st International Conference on Software Engineering (ICSE 2009), pp. 364–374. IEEE Computer Society, Washington, DC (2009)