Abstract
The twenty-first century has transformed the world of science by breaking the physical boundaries of distributed organizations and interconnecting them into virtual science environments, allowing systems and systems of systems to seamlessly access and share information and resources across widely distributed geographic areas. This e-science transformation is enabling new scientific discoveries by allowing for greater collaboration as well as by enabling systems to combine and correlate disparate data sets. At the Jet Propulsion Laboratory in Pasadena, California, we have been developing science data systems for highly distributed communities in the physical and life sciences that require extensive sharing of distributed services and common information models based on common architectures. The common architecture contributes a set of atomic functions, interfaces, and information models that support sharing and distributed processing. Additionally, the architecture provides a blueprint for a software product line known as the Object Oriented Data Technology (OODT) framework. OODT has enabled reuse of software for science data generation, capture and management, and delivery across highly distributed organizations for planetary science, earth science, and cancer research. Our experience to date shows that a well-defined architecture and an accompanying set of software vastly improve our ability to develop road maps for and to construct virtual science environments.
Acknowledgment
The research was carried out at the Jet Propulsion Laboratory, California Institute of Technology, under a contract with the National Aeronautics and Space Administration.
Copyright information
© 2011 Springer-Verlag London Limited
About this chapter
Cite this chapter
Crichton, D.J., Mattmann, C.A., Hughes, J.S., Kelly, S.C., Hart, A.F. (2011). A Multidisciplinary, Model-Driven, Distributed Science Data System Architecture. In: Yang, X., Wang, L., Jie, W. (eds) Guide to e-Science. Computer Communications and Networks. Springer, London. https://doi.org/10.1007/978-0-85729-439-5_5
DOI: https://doi.org/10.1007/978-0-85729-439-5_5
Publisher Name: Springer, London
Print ISBN: 978-0-85729-438-8
Online ISBN: 978-0-85729-439-5
eBook Packages: Computer Science, Computer Science (R0)