Provenance in Databases
History; Lineage; Origin; Pedigree; Source
Let t be a data element in the result of a query Q applied to a dataset D. The provenance of t is the set of all proofs for t according to Q and D. A proof for t according to Q and D is a subset D′ of the data elements in D such that t is in the result of applying Q to D′. In some cases, a proof also details the process by which t is derived from Q and D′.
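To make the definition concrete, the following is a minimal brute-force sketch, not a practical algorithm. It assumes a hypothetical toy dataset D of tagged tuples for two relations R(a, b) and S(b, c), and a hypothetical query Q that joins them on b and projects (a, c). The names `query` and `proofs` are illustrative only. The sketch enumerates every subset D′ of D and keeps those for which t is in Q(D′).

```python
from itertools import combinations


def query(db):
    """Hypothetical Q: project (a, c) from R(a, b) joined with S(b, c)."""
    r = [(a, b) for (rel, a, b) in db if rel == "R"]
    s = [(b, c) for (rel, b, c) in db if rel == "S"]
    return {(a, c) for (a, b) in r for (b2, c) in s if b == b2}


def proofs(q, db, t):
    """Return every subset D' of db such that t is in q(D')."""
    elems = sorted(db)
    found = []
    # Enumerate subsets by increasing size; exponential in |db|.
    for k in range(len(elems) + 1):
        for subset in combinations(elems, k):
            if t in q(set(subset)):
                found.append(frozenset(subset))
    return found


# Toy dataset: each element is (relation name, first column, second column).
D = {("R", 1, 2), ("R", 1, 3), ("S", 2, 5), ("S", 3, 5)}

all_proofs = proofs(query, D, (1, 5))
```

In this toy instance, t = (1, 5) can be derived either from R(1, 2) with S(2, 5) or from R(1, 3) with S(3, 5), so every proof contains at least one of these two pairs. Enumerating all subsets is exponential in the size of D and is shown only to mirror the definition.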
Most work on provenance in databases has focused on finding minimal subsets of D that witness the existence of t in the result, and on identifying the parts of D from which t is copied. More general forms of provenance based on annotations (e.g., elements of algebraic structures such as semirings) have also been investigated. Provenance is also important for understanding how data in databases has evolved as a result of updates over time, particularly in curated scientific databases.
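As an illustration of annotation-based provenance, the sketch below evaluates one join-and-project query over annotated relations, with the semiring passed in as a parameter. It follows the usual convention that a join multiplies annotations and a projection adds the annotations of tuples that collapse to the same output. The relations, the tuple identifiers r1, r2, s1, s2, and the names `Semiring`, `join_project`, and `token` are assumptions made for this example, not part of any particular system.

```python
from collections import namedtuple

# A semiring is given by its two identities and its two operations.
Semiring = namedtuple("Semiring", "zero one plus times")

# Counting semiring: annotations are multiplicities.
counting = Semiring(0, 1, lambda x, y: x + y, lambda x, y: x * y)

# Witness semiring: an annotation is a set of witness sets.
# Addition is union; multiplication is pairwise union of witness sets.
witnesses = Semiring(
    frozenset(),
    frozenset({frozenset()}),
    lambda x, y: x | y,
    lambda x, y: frozenset(a | b for a in x for b in y),
)


def join_project(r, s, k):
    """Annotated version of: project (a, c) from R(a, b) join S(b, c)."""
    out = {}
    for (a, b), ann_r in r.items():
        for (b2, c), ann_s in s.items():
            if b == b2:
                prod = k.times(ann_r, ann_s)  # join multiplies
                out[(a, c)] = k.plus(out.get((a, c), k.zero), prod)  # projection adds
    return out


def token(name):
    """Annotation for a base tuple: one witness set containing only that tuple's id."""
    return frozenset({frozenset({name})})


# Same toy data under two different annotation domains.
R_count = {(1, 2): 1, (1, 3): 1}
S_count = {(2, 5): 1, (3, 5): 1}
counts = join_project(R_count, S_count, counting)

R_why = {(1, 2): token("r1"), (1, 3): token("r2")}
S_why = {(2, 5): token("s1"), (3, 5): token("s2")}
why = join_project(R_why, S_why, witnesses)
```

With the counting semiring the output tuple (1, 5) is annotated with the number of its derivations; with the witness semiring it is annotated with the sets of input tuples that jointly produce it. Swapping the semiring changes what is recorded without changing the query evaluation code.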
Data provenance (or fine-grained provenance) is an account of the derivation of a piece...