A Multidisciplinary, Model-Driven, Distributed Science Data System Architecture

Guide to e-Science

Abstract

The twenty-first century has transformed the world of science by breaking the physical boundaries of distributed organizations and interconnecting them into virtual science environments, allowing systems, and systems of systems, to access and share information and resources seamlessly across widely distributed geographic areas. This e-science transformation is enabling new scientific discoveries by allowing greater collaboration and by enabling systems to combine and correlate disparate data sets. At the Jet Propulsion Laboratory in Pasadena, California, we have been developing science data systems for highly distributed communities in the physical and life sciences that require extensive sharing of distributed services and common information models based on common architectures. The common architecture contributes a set of atomic functions, interfaces, and information models that support sharing and distributed processing. Additionally, the architecture provides a blueprint for a software product line known as the Object Oriented Data Technology (OODT) framework. OODT has enabled reuse of software for science data generation, capture and management, and delivery across highly distributed organizations for planetary science, earth science, and cancer research. Our experience to date shows that a well-defined architecture and a set of accompanying software vastly improve our ability to develop road maps for, and to construct, virtual science environments.

Acknowledgment

The research was carried out at the Jet Propulsion Laboratory, California Institute of Technology, under a contract with the National Aeronautics and Space Administration.

Author information

Corresponding author

Correspondence to Daniel J. Crichton.

Copyright information

© 2011 Springer-Verlag London Limited

About this chapter

Cite this chapter

Crichton, D.J., Mattmann, C.A., Hughes, J.S., Kelly, S.C., Hart, A.F. (2011). A Multidisciplinary, Model-Driven, Distributed Science Data System Architecture. In: Yang, X., Wang, L., Jie, W. (eds) Guide to e-Science. Computer Communications and Networks. Springer, London. https://doi.org/10.1007/978-0-85729-439-5_5

  • DOI: https://doi.org/10.1007/978-0-85729-439-5_5

  • Publisher Name: Springer, London

  • Print ISBN: 978-0-85729-438-8

  • Online ISBN: 978-0-85729-439-5

  • eBook Packages: Computer Science, Computer Science (R0)
