Abstract
A major challenge of in silico experiments is exploiting the large amount of data generated by scientific apparatus. Scientific data, programs, and workflows are resources to be exchanged among scientists, but they are difficult to use efficiently because of their heterogeneous and distributed nature. Provenance metadata can ease the discovery of these resources. However, keeping track of the execution of experiments and capturing provenance across distributed resources are not simple tasks, so discovering scientific resources in distributed environments remains a challenge. This work presents an architecture that supports the execution of scientific experiments in distributed environments and that captures and stores the provenance of each workflow execution in a repository. To validate the proposed architecture, a bioinformatics workflow called GromDFlow has been defined to execute a real molecular dynamics simulation experiment. The experiment highlights the advantages of this architecture, which is available and is being used for several simulations.
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
da Cruz, S.M.S., Barros, P.M., Bisch, P.M., Campos, M.L.M., Mattoso, M. (2010). A Provenance-Based Approach to Resource Discovery in Distributed Molecular Dynamics Workflows. In: Lacroix, Z. (eds) Resource Discovery. RED 2009. Lecture Notes in Computer Science, vol 6162. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-14415-8_5
DOI: https://doi.org/10.1007/978-3-642-14415-8_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-14414-1
Online ISBN: 978-3-642-14415-8
eBook Packages: Computer Science, Computer Science (R0)