Abstract
As Open Data becomes commonplace, methods are needed to integrate disparate data from a variety of sources. Although Linked Data design shows promise for integrating data worldwide, integrators often struggle to provide appropriate transparency about their sources and transformations. Without this transparency, cautious consumers are unlikely to find enough information to trust third-party content. While capturing provenance in RPI’s Linking Open Government Data project, we faced the common problem that only a portion of the captured provenance is effectively used. Using our water quality portal as an example use case, we argue that one key to enabling provenance use is a better treatment of provenance granularity. To address this challenge, we have designed an approach that supports deriving abstracted provenance from granular provenance in an open environment. We describe the approach, show how it addresses the naturally occurring unmet provenance needs in a family of applications, and describe how it addresses similar problems in open provenance and open data environments.
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Lebo, T., Wang, P., Graves, A., McGuinness, D.L. (2012). Towards Unified Provenance Granularities. In: Groth, P., Frew, J. (eds) Provenance and Annotation of Data and Processes. IPAW 2012. Lecture Notes in Computer Science, vol 7525. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-34222-6_4
DOI: https://doi.org/10.1007/978-3-642-34222-6_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-34221-9
Online ISBN: 978-3-642-34222-6
eBook Packages: Computer Science (R0)