Provenance in Scientific Databases
History; Lineage; Origin; Pedigree; Source
Scientific databases contain data that may have been produced as the answer to a query posed over other resources, generated by in silico experiments (or scientific workflows) involving various software tools, or manually curated by domain experts based on analysis of several other resources. The provenance of a piece of data in a scientific database typically includes information about where the data originates, as well as details of the scientific process (e.g., parameters used in the experiments and software versions) by which it arrived in the database.
Provenance in scientific databases has been studied at two granularities: workflow provenance and data provenance.
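To make the definition concrete, the following is a minimal sketch of how such provenance information might be recorded and traced back to its origins. The names ProvenanceRecord and trace_origins, the field layout, and the sample file names, parameters, and versions are all hypothetical illustrations, not a format prescribed by any particular system.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass(frozen=True)
class ProvenanceRecord:
    """Hypothetical record of where a data item came from and how it was derived."""
    data_id: str        # identifier of the data item in the database
    sources: List[str]  # identifiers of the resources it was derived from
    process: str        # e.g. "query", "workflow", or "manual curation"
    parameters: Dict[str, str] = field(default_factory=dict)         # experiment parameters
    software_versions: Dict[str, str] = field(default_factory=dict)  # tool name -> version


def trace_origins(data_id: str, records: Dict[str, ProvenanceRecord]) -> Set[str]:
    """Return the resources that data_id ultimately derives from.

    A resource counts as an origin when no provenance record exists for it,
    so an item with no record of its own is reported as its own origin.
    """
    origins: Set[str] = set()
    seen: Set[str] = set()
    stack = [data_id]
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # guard against revisiting shared or cyclic sources
        seen.add(current)
        record = records.get(current)
        if record is None:
            origins.add(current)
        else:
            stack.extend(record.sources)
    return origins


# Illustrative (made-up) records for a two-step in silico experiment.
records = {
    "aligned.fasta": ProvenanceRecord(
        data_id="aligned.fasta",
        sources=["seqs.fasta"],
        process="workflow",
        parameters={"gap_penalty": "10"},
        software_versions={"aligner": "2.1"},
    ),
    "tree.nwk": ProvenanceRecord(
        data_id="tree.nwk",
        sources=["aligned.fasta", "outgroup.fasta"],
        process="workflow",
        software_versions={"tree_builder": "1.0"},
    ),
}

print(sorted(trace_origins("tree.nwk", records)))
```

In this sketch the "where it originates from" part of provenance is answered by following the sources links, while the parameters and software_versions fields carry the details of the scientific process attached to each derivation step.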
Workflow provenance (or coarse-grained provenance) refers to the record of the history (or workflow) of the derivation of some dataset in a scientific workflow [1, 2, 3]. The amount of information recorded for...
- 2. Moreau L, Ludäscher B, Altintas I, Barga RS, Bowers S, Callahan S, Chin G Jr, Clifford B, Cohen S, Cohen-Boulakia S, Davidson S, Deelman E, Digiampietri L, Foster I, Freire J, Frew J, Futrelle J, Gibson T, Gil Y, Goble C, Golbeck J, Groth P, Holland DA, Jiang S, Kim J, Koop D, Krenek A, McPhillips T, Mehta G, Miles S, Metzger D, Munroe S, Myers J, Plale B, Podhorszki N, Ratnakar V, Santos E, Scheidegger C, Schuchardt K, Seltzer M, Simmhan YL, Silva C, Slaughter P, Stephan E, Stevens R, Turi D, Vo H, Wilde M, Zhao J, Zhao Y. The first provenance challenge. Concurrency Comput Pract Exp. 2007;20(5):409–18. Special issue on the First Provenance Challenge.
- 4. Buneman P, Tan WC. Provenance in databases. In: Proceedings of the ACM SIGMOD International Conference on Management of Data; 2007. p. 1171–3.
- 5. Biton O, Cohen-Boulakia S, Davidson S. Zoom* UserViews: querying relevant provenance in workflow systems. In: Proceedings of the 33rd International Conference on Very Large Data Bases; 2007. p. 1366–9.
- 6. Biton O, Cohen-Boulakia S, Davidson S, Hara CS. Querying and managing provenance through user views in scientific workflows. In: Proceedings of the 24th International Conference on Data Engineering; 2008.
- 9. Green TJ, Karvounarakis G, Tannen V. Provenance semirings. In: Proceedings of the 26th ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems; 2007. p. 31–40.
- 10. Wang YR, Madnick SE. A polygen model for heterogeneous database systems: the source tagging perspective. In: Proceedings of the 16th International Conference on Very Large Data Bases; 1990. p. 519–38.
- 11. Buneman P, Khanna S, Tan WC. On propagation of deletions and annotations through views. In: Proceedings of the 21st ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems; 2002. p. 150–8.
- 13. Buneman P, Chapman A, Cheney J. Provenance management in curated databases. In: Proceedings of the ACM SIGMOD International Conference on Management of Data; 2006. p. 539–50.
- 14. Benjelloun O, Sarma AD, Halevy AY, Widom J. ULDBs: databases with uncertainty and lineage. In: Proceedings of the 32nd International Conference on Very Large Data Bases; 2006. p. 953–64.
- 15. Chiticariu L, Tan WC. Debugging schema mappings with routes. In: Proceedings of the 32nd International Conference on Very Large Data Bases; 2006. p. 79–90.