Database-Centric Scientific Computing

(In Memoriam Jim Gray)

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11019)

Abstract

Working with Jim Gray, we set out more than 20 years ago to design and build the SkyServer, the archive for the Sloan Digital Sky Survey (SDSS). At the time there were very few examples to learn from, and we had to invent much of the system ourselves. The SDSS project collected a huge data set over a large fraction of the Northern Sky and turned it into an open resource for the world’s astronomy community. Over the years the project has changed astronomy. It now faces the problem of ensuring that the data will be preserved and kept alive for active use for another 15 to 20 years. The paper discusses the lessons learned and future directions, and recalls some memorable moments of our collaboration.

Author information

Corresponding author

Correspondence to Alexander S. Szalay.

Copyright information

© 2018 Springer Nature Switzerland AG

About this paper

Cite this paper

Szalay, A.S. (2018). Database-Centric Scientific Computing. In: Benczúr, A., Thalheim, B., Horváth, T. (eds) Advances in Databases and Information Systems. ADBIS 2018. Lecture Notes in Computer Science, vol 11019. Springer, Cham. https://doi.org/10.1007/978-3-319-98398-1_1

  • DOI: https://doi.org/10.1007/978-3-319-98398-1_1

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-98397-4

  • Online ISBN: 978-3-319-98398-1

  • eBook Packages: Computer Science (R0)
