Abstract
Advances in numerical modeling, computational hardware, and problem-solving environments have driven the growth of computational science over the past decades. Science gateways, based on service-oriented architectures and scientific workflows, are a further step in democratizing access to advanced numerical and scientific tools, computational resources, and massive data storage, and in fostering collaboration. Dynamic, data-driven applications, such as those found in weather forecasting, present interesting challenges to science gateways, which are being addressed as part of the LEAD Cyberinfrastructure project. In this article, we discuss three important data-related problems faced by such adaptive, data-driven environments: managing a user’s personal workspace and metadata on the Grid, tracking the provenance of scientific workflows and data products, and continuous data mining over observational weather data.
Please use the following format when citing this chapter: Simmhan, Y. L., Pallickara, S. L., Vijayakumar, N. N., Plale, B., 2007, in IFIP International Federation for Information Processing, Volume 239, Grid-Based Problem Solving Environments, eds. Gaffney, P. W., Pool, J.C.T., (Boston: Springer), pp. 317–333.
© 2007 International Federation for Information Processing
About this paper
Cite this paper
Simmhan, Y.L., Pallickara, S.L., Vijayakumar, N.N., Plale, B. (2007). Data Management in Dynamic Environment-driven Computational Science. In: Gaffney, P.W., Pool, J.C.T. (eds) Grid-Based Problem Solving Environments. IFIP The International Federation for Information Processing, vol 239. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-73659-4_17
DOI: https://doi.org/10.1007/978-0-387-73659-4_17
Publisher Name: Springer, Boston, MA
Print ISBN: 978-0-387-73658-7
Online ISBN: 978-0-387-73659-4
eBook Packages: Computer Science (R0)