(Re)Use in Public Scientific Workflow Repositories

  • Conference paper
Scientific and Statistical Database Management (SSDBM 2012)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 7338)

Abstract

Scientific workflows help in designing, managing, monitoring, and executing in-silico experiments. Because scientific workflows are often complex, sharing them through public workflow repositories has become an important issue for the community. However, as the number of workflows in such repositories grows, users increasingly need assistance in discovering the right workflow for a given task. A key first step toward meaningful similarity measures for workflows is identifying the functional elements they share. In this paper, we present the results of a study of myExperiment.org, probably the largest open workflow repository. Our contributions are threefold: (i) we discuss the critical problem of identifying same or similar (sub-)workflows and workflow elements; (ii) we study, for the first time, the problem of cross-author reuse; and (iii) we provide a detailed analysis of the frequency of reuse of elements between workflows and authors, and identify characteristics of shared elements.
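
To make the notions of shared elements and cross-author reuse concrete, the following is a minimal sketch on a toy data set. It is not the method used in the paper: the workflow names, authors, and element identifiers are hypothetical, and the sketch assumes that the hard problem of deciding when two elements are "the same" has already been reduced to comparing plain string identifiers. Jaccard overlap stands in here as one simple similarity measure built on shared elements.

```python
from collections import defaultdict

# Hypothetical toy repository: workflow id -> (author, set of element identifiers).
# Element identity is assumed to be already resolved into plain strings.
workflows = {
    "wf1": ("alice", {"fetch_sequence", "blast", "parse_xml"}),
    "wf2": ("alice", {"fetch_sequence", "blast", "plot"}),
    "wf3": ("bob", {"blast", "parse_xml", "align"}),
}


def element_usage(workflows):
    """Map each element to the workflows and to the authors that use it."""
    used_in = defaultdict(set)
    used_by = defaultdict(set)
    for wf_id, (author, elements) in workflows.items():
        for element in elements:
            used_in[element].add(wf_id)
            used_by[element].add(author)
    return used_in, used_by


def jaccard(a, b):
    """Fraction of elements two workflows share (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


used_in, used_by = element_usage(workflows)

# Elements that appear in more than one workflow (reuse in general).
reused = sorted(e for e, wfs in used_in.items() if len(wfs) > 1)

# Elements that appear in workflows of more than one author (cross-author reuse).
cross_author = sorted(e for e, authors in used_by.items() if len(authors) > 1)

print("reused:", reused)
print("cross-author:", cross_author)
print("similarity wf1/wf2:", jaccard(workflows["wf1"][1], workflows["wf2"][1]))
```

In this toy data, `fetch_sequence` is reused only within one author's workflows, while `blast` and `parse_xml` are reused across authors, which illustrates the distinction between general reuse and cross-author reuse.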

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Starlinger, J., Cohen-Boulakia, S., Leser, U. (2012). (Re)Use in Public Scientific Workflow Repositories. In: Ailamaki, A., Bowers, S. (eds) Scientific and Statistical Database Management. SSDBM 2012. Lecture Notes in Computer Science, vol 7338. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-31235-9_24

  • DOI: https://doi.org/10.1007/978-3-642-31235-9_24

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-31234-2

  • Online ISBN: 978-3-642-31235-9

  • eBook Packages: Computer Science, Computer Science (R0)
