Abstract
Scientific workflows have become increasingly important for enabling and accelerating scientific discovery. A growing number of scientists and researchers rely on workflow systems to integrate and structure heterogeneous local and remote data and services in order to perform in silico experiments. To support the understanding, validation, and reproduction of scientific results, provenance querying and management have become critical components in scientific workflows. In this paper, we propose a logic programming approach to scientific workflow provenance querying and management, with the following contributions: i) we identify a set of characteristics that are desirable for a scientific workflow provenance query language; ii) based on these requirements, we propose FLOQ, a Frame Logic based query language for scientific workflow provenance; iii) we demonstrate that our previous relational database based provenance model, the virtual data schema, can be easily mapped to the FLOQ model; and iv) we show by examples that FLOQ is expressive enough to formulate common provenance queries, including all the queries posed in the Provenance Challenge series.
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Zhao, Y., Lu, S. (2008). A Logic Programming Approach to Scientific Workflow Provenance Querying. In: Freire, J., Koop, D., Moreau, L. (eds) Provenance and Annotation of Data and Processes. IPAW 2008. Lecture Notes in Computer Science, vol 5272. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-89965-5_5
DOI: https://doi.org/10.1007/978-3-540-89965-5_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-89964-8
Online ISBN: 978-3-540-89965-5
eBook Packages: Computer Science (R0)