Abstract
Scientific workflows may include automated decision steps, for instance to accept/reject certain data products during the course of an in silico experiment, based on an assessment of their quality. The trustworthiness of these workflows can be enhanced by providing the users with a trace and explanation of the outcome of these decisions. In this paper we present a provenance model that is designed specifically to support this task. The model applies to a particular type of sub-workflow that is compiled automatically from a high-level specification of user-defined, quality-based data acceptance criteria. The keys to the effectiveness of the approach are that (i) these sub-workflows follow a predictable pattern structure, (ii) the purpose of their component services is defined using an ontology of Information Quality concepts, and (iii) the conceptual model for provenance is consistent with the ontology structure.
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Missier, P., Embury, S., Stapenhurst, R. (2008). Exploiting Provenance to Make Sense of Automated Decisions in Scientific Workflows. In: Freire, J., Koop, D., Moreau, L. (eds) Provenance and Annotation of Data and Processes. IPAW 2008. Lecture Notes in Computer Science, vol 5272. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-89965-5_19
DOI: https://doi.org/10.1007/978-3-540-89965-5_19
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-89964-8
Online ISBN: 978-3-540-89965-5
eBook Packages: Computer Science (R0)