Abstract
In many application domains the provenance of data plays an important role. It is often required to store detailed information about the underlying processes that led to the data (e.g., results of numerical simulations), either for documentation or to check the process for compliance with applicable regulations. Especially in science and engineering, more and more applications are being developed in Python, which is used either to develop the whole application or as a glue language for coordinating codes written in other programming languages. To easily integrate provenance recording into applications developed in Python, a provenance client library with a suitable Python API is useful. In this paper we present such a Python client library for recording and querying provenance information. We show an exemplary application, explain the overall architecture of the library, and give some details on the technologies used for the implementation.
Keywords
- Application Developer
- Globus Toolkit
- Provenance Information
- Soap Message
- Object Oriented Programming Language
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Bochner, C., Gude, R., Schreiber, A. (2008). A Python Library for Provenance Recording and Querying. In: Freire, J., Koop, D., Moreau, L. (eds) Provenance and Annotation of Data and Processes. IPAW 2008. Lecture Notes in Computer Science, vol 5272. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-89965-5_24
DOI: https://doi.org/10.1007/978-3-540-89965-5_24
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-89964-8
Online ISBN: 978-3-540-89965-5
eBook Packages: Computer Science, Computer Science (R0)