A Formal Model of Dataflow Repositories

  • Jan Hidders
  • Natalia Kwasnikowska
  • Jacek Sroka
  • Jerzy Tyszkiewicz
  • Jan Van den Bussche
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4544)

Abstract

Dataflow repositories are databases containing dataflows and their different runs. We propose a formal conceptual data model for such repositories. Our model includes careful formalisations of such features as complex data manipulation, external service calls, subdataflows, and the provenance of output values.

Keywords

Inference Rule, Free Variable, Service Function, Function Assignment, External Service
These keywords were added by machine and not by the authors.

Copyright information

© Springer Berlin Heidelberg 2007

Authors and Affiliations

  • Jan Hidders (1)
  • Natalia Kwasnikowska (2)
  • Jacek Sroka (3)
  • Jerzy Tyszkiewicz (3)
  • Jan Van den Bussche (2)

  1. University of Antwerp, Belgium
  2. Hasselt University and Transnational University of Limburg, Belgium
  3. Warsaw University, Poland
