The Lifecycle of Provenance Metadata and Its Associated Challenges and Opportunities

  • Conference paper
Building Trust in Information

Part of the book series: Springer Proceedings in Business and Economics (SPBE)

Abstract

This chapter outlines some of the challenges and opportunities associated with adopting provenance principles (Cheney et al., Dagstuhl Reports 2(2):84–113, 2012) and standards (Moreau et al., Web Semant. Sci. Serv. Agents World Wide Web, 2015) across a variety of disciplines, including data publication and reuse and the information sciences.

References

  1. Amsterdamer, Y., Davidson, S.B., Deutch, D., Milo, T., Stoyanovich, J., Tannen, V.: Putting lipstick on pig: enabling database-style workflow provenance. Proc. VLDB Endow. 5 (4), 346–357 (2011)

  2. Biton, O., Cohen-Boulakia, S., Davidson, S.B.: Zoom*UserViews: querying relevant provenance in workflow systems. In: VLDB, pp. 1366–1369 (2007)

  3. Cadenhead, T., Khadilkar, V., Kantarcioglu, M., Thuraisingham, B.: Transforming provenance using redaction. In: Proceedings of the 16th ACM Symposium on Access Control Models and Technologies, SACMAT ’11, pp. 93–102. ACM, New York (2011)

  4. Cheney, J., Chiticariu, L., Tan, W.-C.: Provenance in databases: why, how, and where. Found. Trends Databases 1, 379–474 (2009)

  5. Cheney, J., Missier, P., Moreau, L.: Constraints of the provenance data model. Technical Report (2012)

  6. Cheney, J., Finkelstein, A., Ludaescher, B., Vansummeren, S.: Principles of provenance (Dagstuhl Seminar 12091). Dagstuhl Reports 2 (2), 84–113 (2012)

  7. Cohen-Boulakia, S., Leser, U.: Search, adapt, and reuse: the future of scientific workflows. SIGMOD Rec. 40 (2), 6–16 (2011)

  8. Davidson, S., Freire, J.: Provenance and scientific workflows: challenges and opportunities. In: Proceedings of SIGMOD Conference, Tutorial, pp. 1345–1350 (2008)

  9. Davidson, S., Cohen-Boulakia, S., Eyal, A., Ludäscher, B., McPhillips, T., Bowers, S., Anand, M.K., Freire, J.: Provenance in scientific workflow systems. In: Data Engineering Bulletin, vol. 30. IEEE, New York (2007)

  10. Dey, S., Zinn, D., Ludäscher, B.: ProPub: towards a declarative approach for publishing customized, policy-aware provenance. In: Cushing, J.B., French, J., Bowers, S. (eds.) Scientific and Statistical Database Management. Lecture Notes in Computer Science, vol. 6809, pp. 225–243. Springer, Berlin, Heidelberg (2011)

  11. Firth, H., Missier, P.: ProvGen: generating synthetic PROV graphs with predictable structure. In: Proceedings of IPAW 2014 (Provenance and Annotations), Koln (2014)

  12. Ghoshal, D., Plale, B.: Provenance from log files: a bigdata problem. In: Proceedings of BigProv Workshop on Managing and Querying Provenance at Scale (2013)

  13. Green, T.J., Karvounarakis, G., Tannen, V.: Provenance semirings. In: PODS, pp. 31–40 (2007)

  14. Hiden, H., Watson, P., Woodman, S., Leahy, D.: e-Science central: cloud-based e-Science and its application to chemical property modelling. Technical Report cs-tr-1227. School of Computing Science, Newcastle University (2011)

  15. Hull, D., Wolstencroft, K., Stevens, R., Goble, C.A., Pocock, M.R., Li, P., Oinn, T.: Taverna: a tool for building and running workflows of services. Nucleic Acids Res. 34, 729–732 (2006)

  16. Katz, D.S.: Transitive credit as a means to address social and technological concerns stemming from citation and attribution of digital products. J. Open Res. Soft. 2 (1), e20 (2014)

  17. Kratz, J.E., Strasser, C.: Making data count. Nature Scientific Data 2, 150039 (2015)

  18. Lebo, T., Sahoo, S., McGuinness, D., Belhajjame, K., Cheney, J., Corsar, D., Garijo, D., Soiland-Reyes, S., Zednik, S., Zhao, J.: PROV-O: The PROV ontology. Technical Report (2012)

  19. Lerner, B.S., Boose, E.R.: Collecting provenance in an interactive scripting environment. In: Proceedings of TAPP’14 (2014)

  20. Lerner, B., Boose, E.: RDataTracker: collecting provenance in an interactive scripting environment. In: 6th USENIX Workshop on the Theory and Practice of Provenance (TaPP 2014) (2014)

  21. Lim, C., Lu, S., Chebotko, A., Fotouhi, F.: Prospective and retrospective provenance collection in scientific workflow environments. In: 2010 IEEE International Conference on Services Computing (SCC), pp. 449–456 (2010)

  22. Lyle, J., Martin, A.: Trusted computing and provenance: better together. In: Proceedings of the 2nd Conference on Theory and Practice of Provenance, TAPP’10, Berkeley, CA, p. 1. USENIX Association, Berkeley, CA (2010)

  23. Macko, P., Chiarini, M., Seltzer, M.: Collecting provenance via the Xen hypervisor. In: Freire, J., Buneman, P. (eds.) TAPP Workshop, Heraklion (2011)

  24. Missier, P., Paton, N., Belhajjame, K.: Fine-grained and efficient lineage querying of collection-based workflow provenance. In: Proceedings of EDBT, Lausanne, Switzerland (2010)

  25. Missier, P., Sahoo, S.S., Zhao, J., Sheth, A., Goble, C.: Janus: from workflows to semantic provenance and linked open data. In: Proceedings of IPAW 2010, Troy, NY (2010)

  26. Missier, P., Soiland-Reyes, S., Owen, S., Tan, W., Nenadic, A., Dunlop, I., Williams, A., Oinn, T., Goble, C.: Taverna, reloaded. In: Gertz, M., Hey, T., Ludaescher, B. (eds.) Proceedings of SSDBM 2010, Heidelberg (2010)

  27. Missier, P., Dey, S., Belhajjame, K., Cuevas, V., Ludaescher, B.: D-PROV: extending the PROV provenance model with workflow structure. In: Proceedings of TAPP’13, Lombard, IL (2013)

  28. Missier, P., Woodman, S., Hiden, H., Watson, P.: Provenance and data differencing for workflow reproducibility analysis. Concurr. Comput. 28 (4), 995–1015 (2016)

  29. Missier, P., Bryans, J., Gamble, C., Curcin, V., Danger, R.: ProvAbs: model, policy, and tooling for abstracting PROV graphs. In: Proceedings of IPAW 2014 (Provenance and Annotations), Koln. Springer, Berlin (2014)

  30. Mitchell, C.: Trusted computing. In: Chen, L., Mitchell, C.J., Martin, A. (eds.) Proceedings of Trust 2009, Oxford. Springer, Berlin (2005)

  31. Moreau, L., Ludäscher, B., Altintas, I., Barga, R.S.: The first provenance challenge. Concurr. Comput. 20, 409–418 (2008)

  32. Moreau, L., Clifford, B., Freire, J., Futrelle, J., Gil, Y., Groth, P., Kwasnikowska, N., Miles, S., Missier, P., Myers, J., Plale, B., Simmhan, Y., Stephan, E., Van Den Bussche, J.: The open provenance model—core specification (v1.1). Futur. Gener. Comput. Syst. 27 (6), 743–756 (2011)

  33. Moreau, L., Hartig, O., Simmhan, Y., Myers, J., Lebo, T., Belhajjame, K., Miles, S.: PROV-AQ: provenance access and query. Technical Report (2012)

  34. Moreau, L., Missier, P., Belhajjame, K., B’Far, R., Cheney, J., Coppens, S., Cresswell, S., Gil, Y., Groth, P., Klyne, G., Lebo, T., McCusker, J., Miles, S., Myers, J., Sahoo, S., Tilmes, C.: PROV-DM: the PROV data model. Technical Report. World Wide Web Consortium (2012)

  35. Moreau, L., Missier, P., Cheney, J., Soiland-Reyes, S.: PROV-N: the provenance notation. Technical Report (2012)

  36. Moreau, L., Groth, P., Cheney, J., Lebo, T., Miles, S.: The rationale of PROV. Web Semant. Sci. Serv. Agents World Wide Web 35, Part 4, 235–257 (2015)

  37. Murta, L., Braganholo, V., Chirigati, F., Koop, D., Freire, J.: noWorkflow: capturing and analyzing provenance of scripts. In: Proceedings of IPAW’14 (2014)

  38. PROV DC (2013). Available at http://www.w3.org/TR/prov-dc/

  39. PROV Dictionary (2013). Available at http://www.w3.org/TR/prov-dictionary/

  40. PROV-Overview: An Overview of the PROV Family of Documents. Technical Report (2012)

  41. PROV-XML (2013). Available at http://www.w3.org/TR/prov-xml/

  42. Special Issue on Provenance, Data and Information Quality. J. Data Inf. Qual. 5 (3) (2015)

  43. The Provenance Incubator Group Charter (2009). Available at http://www.w3.org/2005/Incubator/prov/charter

  44. The Provenance Incubator Group Final Report (2010). Available at http://www.w3.org/2005/Incubator/prov/XGR-prov-20101214/

  45. The ProvONE provenance model (2014). Available at http://tinyurl.com/ProvONE

  46. Woodman, S., Hiden, H., Watson, P.: Workflow provenance: an analysis of long term storage costs. In: Proceedings of 10th WORKS workshop, Austin, TX (2015)

  47. Zhang, J., Chapman, A., LeFevre, K.: Do you know where your data's been? Tamper-evident database provenance. In: Jonker, W., Petkovic, M. (eds.) Secure Data Management. Lecture Notes in Computer Science, vol. 5776, pp. 17–32. Springer, Berlin/Heidelberg (2009)

Corresponding author

Correspondence to Paolo Missier.

Copyright information

© 2016 Springer International Publishing Switzerland

Cite this paper

Missier, P. (2016). The Lifecycle of Provenance Metadata and Its Associated Challenges and Opportunities. In: Lemieux, V. (ed.) Building Trust in Information. Springer Proceedings in Business and Economics. Springer, Cham. https://doi.org/10.1007/978-3-319-40226-0_8
