Abstract
Provenance information describes the origins and history of data over its life cycle. Responsibility captures the degree of causality and identifies which facts are most influential in the lineage of a query answer. Since responsibility cannot be computed by a relational query, lineage analysis becomes an essential tool for computing the responsibility of tuples in query results. We extend the definitions of causality and responsibility from a single tuple t for an answer r to a set of tuples for r, and we extend Co-Trees to P-Trees for read-once functions. Using P-Trees, we develop an efficient algorithm to compute the responsibilities of tuples in read-once formulas and a novel algorithm to find the top-k tuples by responsibility in read-once functions. Finally, an experimental evaluation on TPC-H data shows substantial efficiency improvements over the state of the art.
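To make the notion of responsibility concrete, the following is a minimal brute-force sketch, not the P-Tree algorithm the abstract refers to. It assumes the usual contingency-set definition from the database causality literature: the responsibility of a tuple t is 1/(1 + |Γ|) for the smallest set Γ of other tuples whose removal keeps the answer true while additionally removing t makes it false. It also treats every tuple as endogenous. The names `responsibility`, `top_k`, and `lineage` and the example formula are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def responsibility(lineage, tuples, t):
    """Brute-force responsibility of tuple t (exponential; illustration only).

    lineage: function mapping the set of present tuples to True/False.
    tuples:  all tuples appearing in the lineage (all assumed endogenous).
    """
    all_tuples = frozenset(tuples)
    others = sorted(all_tuples - {t})
    # Try contingency sets in order of increasing size, so the first hit is minimal.
    for size in range(len(others) + 1):
        for gamma in combinations(others, size):
            remaining = all_tuples - set(gamma)
            # t must be counterfactual once gamma has been removed.
            if lineage(remaining) and not lineage(remaining - {t}):
                return 1.0 / (1 + size)
    return 0.0  # t is not a cause of the answer


def top_k(lineage, tuples, k):
    """Rank tuples by responsibility (ties keep sorted input order)."""
    scored = [(t, responsibility(lineage, tuples, t)) for t in sorted(tuples)]
    return sorted(scored, key=lambda pair: -pair[1])[:k]


# Hypothetical read-once lineage: (a AND (b OR c)) OR d
def lineage(present):
    return ("a" in present and ("b" in present or "c" in present)) or "d" in present


if __name__ == "__main__":
    for t in "abcd":
        print(t, responsibility(lineage, "abcd", t))
    print(top_k(lineage, "abcd", 2))
```

In this example, a becomes counterfactual after removing d alone, while b needs both c and d removed, so a ranks above b. The enumeration is exponential in the number of tuples, which is the cost that a structure-aware algorithm for read-once formulas is meant to avoid.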
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Qin, B., Wang, S., Du, X. (2013). Efficient Responsibility Analysis for Query Answers. In: Meng, W., Feng, L., Bressan, S., Winiwarter, W., Song, W. (eds) Database Systems for Advanced Applications. DASFAA 2013. Lecture Notes in Computer Science, vol 7825. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-37487-6_20
DOI: https://doi.org/10.1007/978-3-642-37487-6_20
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-37486-9
Online ISBN: 978-3-642-37487-6
eBook Packages: Computer Science (R0)