Abstract
Working with Jim Gray, we set out more than 20 years ago to design and build the SkyServer, the archive for the Sloan Digital Sky Survey (SDSS). At the time there were very few examples to learn from, and we had to invent much of the system ourselves. The SDSS project collected a huge data set over a large fraction of the Northern Sky and turned it into an open resource for the world’s astronomy community. Over the years the project has changed astronomy. The project now faces the problem of ensuring that the data will be preserved and kept alive for active use for another 15 to 20 years. The paper discusses the lessons learned and future directions, and recalls some memorable moments of our collaboration.
Copyright information
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Szalay, A.S. (2018). Database-Centric Scientific Computing. In: Benczúr, A., Thalheim, B., Horváth, T. (eds) Advances in Databases and Information Systems. ADBIS 2018. Lecture Notes in Computer Science, vol 11019. Springer, Cham. https://doi.org/10.1007/978-3-319-98398-1_1
DOI: https://doi.org/10.1007/978-3-319-98398-1_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-98397-4
Online ISBN: 978-3-319-98398-1
eBook Packages: Computer Science, Computer Science (R0)