The Architecture of SciDB

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 6809)

Abstract

SciDB is an open-source analytical database oriented toward the data management needs of scientists. As such, it mixes statistical and linear algebra operations with data management ones, using a natural nested multidimensional array data model. We have been working on the code for two years, most recently with the help of venture capital backing. Release 11.06 (June 2011) is downloadable from our website (SciDB.org).

This paper presents the main design decisions of SciDB. It focuses on our decisions concerning a high-level, SQL-like query language, the issues facing our query optimizer and executor, and efficient storage management for arrays. The paper also discusses the implementation of features not usually present in DBMSs, including version control, uncertainty, and provenance.

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Stonebraker, M., Brown, P., Poliakov, A., Raman, S. (2011). The Architecture of SciDB. In: Bayard Cushing, J., French, J., Bowers, S. (eds) Scientific and Statistical Database Management. SSDBM 2011. Lecture Notes in Computer Science, vol 6809. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-22351-8_1

  • DOI: https://doi.org/10.1007/978-3-642-22351-8_1

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-22350-1

  • Online ISBN: 978-3-642-22351-8

  • eBook Packages: Computer Science, Computer Science (R0)
