A Provenance-Based Approach to Resource Discovery in Distributed Molecular Dynamics Workflows

  • Conference paper
Resource Discovery (RED 2009)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 6162)

Abstract

The major challenge of in silico experiments is exploiting the volume of data generated by scientific apparatus. Scientific data, programs, and workflows are resources to be exchanged among scientists, but their heterogeneous and distributed nature makes them difficult to use efficiently. Provenance metadata can ease the discovery of these resources. However, tracking the execution of experiments and capturing provenance across distributed resources are not simple tasks, so discovering scientific resources in distributed environments remains a challenge. This work presents an architecture that supports the execution of scientific experiments in distributed environments and captures and stores the provenance of each workflow execution in a repository. To validate the proposed architecture, a bioinformatics workflow called GromDFlow was defined for the execution of a real molecular dynamics simulation experiment. The experiment highlights the advantages of this architecture, which is available and is being used for several simulations.

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

da Cruz, S.M.S., Barros, P.M., Bisch, P.M., Campos, M.L.M., Mattoso, M. (2010). A Provenance-Based Approach to Resource Discovery in Distributed Molecular Dynamics Workflows. In: Lacroix, Z. (eds) Resource Discovery. RED 2009. Lecture Notes in Computer Science, vol 6162. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-14415-8_5

  • DOI: https://doi.org/10.1007/978-3-642-14415-8_5

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-14414-1

  • Online ISBN: 978-3-642-14415-8

  • eBook Packages: Computer Science, Computer Science (R0)