Synonyms
History; Lineage; Origin; Pedigree; Source
Definition
Scientific databases contain data that may have been produced as the answer to a query posed over other resources, generated by in silico experiments (or scientific workflows) involving various software tools, or manually curated by domain experts based on analysis of several other resources. The provenance of a piece of data in a scientific database typically includes information about where that data originates, as well as details of the scientific process (e.g., parameters used in the experiments, software versions) by which it arrived in the database.
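As a rough illustration of the definition above, the following minimal Python sketch shows one way a provenance record could capture both the origin of a data item and the process details that produced it. The class `ProvenanceRecord`, the function `origins`, and all field names and sample values are hypothetical and chosen only for illustration; they are not taken from any particular provenance system.

```python
from dataclasses import dataclass, field


@dataclass
class ProvenanceRecord:
    """Hypothetical provenance record for one data item (illustrative only)."""
    item_id: str                 # identifier of the data item in the database
    process: str                 # e.g. "query", "workflow", or "manual curation"
    sources: tuple = ()          # identifiers of the items it was derived from
    parameters: dict = field(default_factory=dict)         # experiment parameters
    software_versions: dict = field(default_factory=dict)  # tool name -> version


def origins(item_id, records):
    """Follow `sources` links transitively; return ids with no recorded sources."""
    by_id = {r.item_id: r for r in records}
    roots, stack, seen = set(), [item_id], set()
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        rec = by_id.get(current)
        if rec is None or not rec.sources:
            roots.add(current)       # nothing further is known about this item
        else:
            stack.extend(rec.sources)
    return roots


# Made-up sample data: a curated annotation derived from a workflow output
# and an entry in an external resource.
records = [
    ProvenanceRecord(
        "aligned_seq", "workflow",
        sources=("raw_seq",),
        parameters={"gap_penalty": 2},
        software_versions={"aligner": "1.0"},
    ),
    ProvenanceRecord(
        "annotation", "manual curation",
        sources=("aligned_seq", "external_db_entry"),
    ),
]

print(origins("annotation", records))
```

In this sketch, asking for the origins of `annotation` walks back through `aligned_seq` to `raw_seq` and also reports `external_db_entry`, while the `parameters` and `software_versions` fields hold the process details mentioned in the definition.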
Historical Background
Provenance in scientific databases has been studied at two granularities: workflow provenance and data provenance.
Workflow provenance (or coarse-grained provenance) refers to the record of the history (or workflow) of the derivation of some dataset in a scientific workflow [16721,16722,3]. The amount of information recorded for...
Recommended Reading
Bose R, Frew J. Lineage retrieval for scientific data processing: a survey. ACM Comput Surv. 2005;37(1):1–28.
Moreau L, Ludäscher B, Altintas I, Barga RS, Bowers S, Callahan S, Chin G Jr, Clifford B, Cohen S, Cohen-Boulakia S, Davidson S, Deelman E, Digiampietri L, Foster I, Freire J, Frew J, Futrelle J, Gibson T, Gil Y, Goble C, Golbeck J, Groth P, Holland DA, Jiang S, Kim J, Koop D, Krenek A, McPhillips T, Mehta G, Miles S, Metzger D, Munroe S, Myers J, Plale B, Podhorszki N, Ratnakar V, Santos E, Scheidegger C, Schuchardt K, Seltzer M, Simmhan YL, Silva C, Slaughter P, Stephan E, Stevens R, Turi D, Vo H, Wilde M, Zhao J, Zhao Y. The first provenance challenge. Concurrency Comput Pract Exp. 2007;20(5):409–18. Special issue on the First Provenance Challenge.
Simmhan Y, Plale B, Gannon D. A survey of data provenance in e-science. ACM SIGMOD Rec. 2005;34:31–6.
Buneman P, Tan WC. Provenance in databases. In: Proceedings of the ACM SIGMOD International Conference on Management of Data; 2007. p. 1171–3.
Biton O, Cohen-Boulakia S, Davidson S. Zoom* UserViews: querying relevant provenance in workflow systems. In: Proceedings of the 33rd International Conference on Very Large Data Bases; 2007. p. 1366–9.
Biton O, Cohen-Boulakia S, Davidson S, Hara CS. Querying and managing provenance through user views in scientific workflows. In: Proceedings of the 24th International Conference on Data Engineering; 2008.
Buneman P, Khanna S, Tan WC. Why and where: a characterization of data provenance. In: Proceedings of the 8th International Conference on Database Theory; 2001. p. 316–30.
Cui Y, Widom J, Wiener JL. Tracing the lineage of view data in a warehousing environment. ACM Trans Database Syst. 2000;25(2):179–227.
Green TJ, Karvounarakis G, Tannen V. Provenance semirings. In: Proceedings of the 26th ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems; 2007. p. 31–40.
Wang YR, Madnick SE. A polygen model for heterogeneous database systems: the source tagging perspective. In: Proceedings of the 16th International Conference on Very Large Data Bases; 1990. p. 519–38.
Buneman P, Khanna S, Tan WC. On propagation of deletions and annotations through views. In: Proceedings of the 21st ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems; 2002. p. 150–8.
Bhagwat D, Chiticariu L, Tan WC, Vijayvargiya G. An annotation management system for relational databases. VLDB J. 2005;14(4):373–96.
Buneman P, Chapman A, Cheney J. Provenance management in curated databases. In: Proceedings of the ACM SIGMOD International Conference on Management of Data; 2006. p. 539–50.
Benjelloun O, Sarma AD, Halevy AY, Widom J. ULDBs: databases with uncertainty and lineage. In: Proceedings of the 32nd International Conference on Very Large Data Bases; 2006. p. 953–64.
Chiticariu L, Tan WC. Debugging schema mappings with routes. In: Proceedings of the 32nd International Conference on Very Large Data Bases; 2006. p. 79–90.
© 2018 Springer Science+Business Media, LLC, part of Springer Nature
Cite this entry
Cohen-Boulakia, S., Tan, WC. (2018). Provenance in Scientific Databases. In: Liu, L., Özsu, M.T. (eds) Encyclopedia of Database Systems. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-8265-9_282
DOI: https://doi.org/10.1007/978-1-4614-8265-9_282
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4614-8266-6
Online ISBN: 978-1-4614-8265-9
eBook Packages: Computer Science; Reference Module Computer Science and Engineering