Scientific Mashups: Runtime-Configurable Data Product Ensembles

  • Conference paper
Scientific and Statistical Database Management (SSDBM 2009)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 5566)

Abstract

Mashups are gaining popularity as a rapid-development, re-use-oriented programming model to replace monolithic, bottom-up application development. This programming style is attractive for the “long tail” of scientific data management applications, characterized by exploding data volumes, increasing requirements for data sharing and collaboration, but limited software engineering budgets.

We observe that scientists already routinely construct a primitive, static form of mashup: an ensemble of related visualizations that convey a specific scientific message, encoded as, e.g., a PowerPoint slide. Inspired by their ubiquity, we adopt these conventional data-product ensembles as a core model, endow them with interactivity, publish them online, and allow them to be repurposed at runtime by non-programmers.

We observe that these scientific mashups must accommodate a wider audience than commerce-oriented and entertainment-oriented mashups. Collaborators, students (K12 through graduate), the public, and policy makers are all potential consumers, but each group has a different level of domain sophistication. We explore techniques for adapting one mashup for different audiences by attaching additional context, assigning defaults, and re-skinning component products.

Existing mashup frameworks (and scientific workflow systems) emphasize an expressive “boxes-and-arrows” abstraction suitable for engineering individual products but overlook requirements for organizing products into synchronized ensembles or repurposing them for different audiences.

In this paper, we articulate these requirements for scientific mashups, describe an architecture for composing mashups as interactive, reconfigurable, web-based, visualization-oriented data product ensembles, and report on an initial implementation in use at an Ocean Observatory.

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Howe, B., Green-Fishback, H., Maier, D. (2009). Scientific Mashups: Runtime-Configurable Data Product Ensembles. In: Winslett, M. (eds) Scientific and Statistical Database Management. SSDBM 2009. Lecture Notes in Computer Science, vol 5566. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-02279-1_3

  • DOI: https://doi.org/10.1007/978-3-642-02279-1_3

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-02278-4

  • Online ISBN: 978-3-642-02279-1

  • eBook Packages: Computer Science, Computer Science (R0)
