A Graph Testing Framework for Provenance Network Analytics

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11017)

Abstract

Provenance Network Analytics is a method of analyzing provenance that assesses a collection of provenance graphs by training a machine learning algorithm to make predictions about the characteristics of data artifacts based on their provenance graph metrics. The shape of a provenance graph can vary according to the modelling approach chosen by data analysts, and this is likely to affect the accuracy of machine learning algorithms. We therefore propose a framework for capturing provenance using semantic web technologies, which allows multiple provenance models to be used at runtime so that their effects can be tested.
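To make the point about graph shape concrete, the following is a minimal sketch and not the authors' framework. It uses only the Python standard library, and the function `graph_metrics`, the node names, and the two toy models are hypothetical. It encodes one edit history under two assumed modelling choices (a fine-grained model with an explicit activity node, and a coarse model with direct derivation and attribution edges) and shows that the resulting metrics, which would serve as machine learning features, differ.

```python
from collections import defaultdict


def graph_metrics(edges):
    """Return simple shape metrics for a directed acyclic graph.

    `edges` is an iterable of (source, target) pairs. The metric set is an
    illustrative assumption, not the one used in the paper.
    """
    nodes = set()
    successors = defaultdict(set)
    for src, dst in edges:
        nodes.update((src, dst))
        successors[src].add(dst)

    memo = {}

    def depth(node):
        # Longest path (in edges) starting at `node`; assumes no cycles.
        if node not in memo:
            memo[node] = max(
                (1 + depth(nxt) for nxt in successors.get(node, ())),
                default=0,
            )
        return memo[node]

    return {
        "nodes": len(nodes),
        "edges": sum(len(targets) for targets in successors.values()),
        "max_out_degree": max(
            (len(targets) for targets in successors.values()), default=0
        ),
        "longest_path": max((depth(n) for n in nodes), default=0),
    }


# Hypothetical fine-grained model: v2 was generated by an edit activity,
# which used v1 and was associated with the agent alice.
fine_grained = [("v2", "edit1"), ("edit1", "v1"), ("edit1", "alice")]

# Hypothetical coarse model of the same history: v2 was derived from v1
# and attributed to alice, with no activity node.
coarse = [("v2", "v1"), ("v2", "alice")]

fine_metrics = graph_metrics(fine_grained)
coarse_metrics = graph_metrics(coarse)
print(fine_metrics)
print(coarse_metrics)
```

The same underlying event yields different node counts, edge counts, and path lengths under the two models, so a classifier trained on such features would see different inputs depending on the modelling approach.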

Author information

Correspondence to Bernard Roper.

Copyright information

© 2018 Springer Nature Switzerland AG

About this paper

Cite this paper

Roper, B., Chapman, A., Martin, D., Morley, J. (2018). A Graph Testing Framework for Provenance Network Analytics. In: Belhajjame, K., Gehani, A., Alper, P. (eds) Provenance and Annotation of Data and Processes. IPAW 2018. Lecture Notes in Computer Science, vol 11017. Springer, Cham. https://doi.org/10.1007/978-3-319-98379-0_29

  • DOI: https://doi.org/10.1007/978-3-319-98379-0_29

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-98378-3

  • Online ISBN: 978-3-319-98379-0

  • eBook Packages: Computer Science (R0)
