A Multidisciplinary, Model-Driven, Distributed Science Data System Architecture

  • Daniel J. Crichton
  • Chris A. Mattmann
  • John S. Hughes
  • Sean C. Kelly
  • Andrew F. Hart
Chapter
Part of the Computer Communications and Networks book series (CCN)

Abstract

The twenty-first century has transformed the world of science by breaking the physical boundaries of distributed organizations and interconnecting them into virtual science environments, allowing systems and systems of systems to seamlessly access and share information and resources across widely distributed geographic areas. This e-science transformation is enabling new scientific discoveries by allowing for greater collaboration as well as by enabling systems to combine and correlate disparate data sets. At the Jet Propulsion Laboratory in Pasadena, California, we have been developing science data systems for highly distributed communities in the physical and life sciences that require extensive sharing of distributed services and common information models based on common architectures. The common architecture contributes a set of atomic functions, interfaces, and information models that support sharing and distributed processing. Additionally, the architecture provides a blueprint for a software product line known as the Object Oriented Data Technology (OODT) framework. OODT has enabled reuse of software for science data generation, capture and management, and delivery across highly distributed organizations for planetary science, earth science, and cancer research. Our experience to date shows that a well-defined architecture and a set of accompanying software vastly improve our ability to develop road maps for and to construct virtual science environments.

Keywords

Resource Description Framework; Software Product Line; Information Architecture; Microwave Limb Sound; Planetary Science
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Notes

Acknowledgment

The research was carried out at the Jet Propulsion Laboratory, California Institute of Technology, under a contract with the National Aeronautics and Space Administration.

Copyright information

© Springer-Verlag London Limited 2011

Authors and Affiliations

  • Daniel J. Crichton (1)
  • Chris A. Mattmann (1, 2)
  • John S. Hughes (1)
  • Sean C. Kelly (1)
  • Andrew F. Hart (1)
  1. Jet Propulsion Laboratory, California Institute of Technology, Pasadena, USA
  2. Computer Science Department, University of Southern California, Los Angeles, USA
