Encyclopedia of Database Systems

2018 Edition
| Editors: Ling Liu, M. Tamer Özsu


  • Arie Shoshani
Reference work entry
DOI: https://doi.org/10.1007/978-1-4614-8265-9_380


Statistical correctness; Summarization correctness


Summarizability is a property that assures the correctness of summary operations over On-Line Analytical Processing (OLAP) databases, which are akin to Statistical Databases [10]. Such databases are generally referred to as “summary databases,” and have a data model based on one or more measures defined over the cross product of dimensions. For example, a bookstore company may have multiple stores in many cities. Assume that there is a database containing the stores’ revenues for books sold per day over the last 3 years. In such a database, “revenue” is a measure, and “book,” “store,” and “day” are the dimensions that define the cross product over which the measure revenue is defined. A dimension in a summary database is said to be summarizable relative to a measure if a summary statistic (sum, average, etc.) applied over the dimension produces correct results. For example, if summarization over all the books sold to...
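
The bookstore data model described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the fact rows, the `summarize` helper, and the book-to-category mapping are all hypothetical and do not come from the entry. The first part sums the measure revenue over dimensions where each fact contributes exactly once. The second part shows one commonly cited way a roll-up can give an incorrect total, namely when a dimension value (here a book) belongs to more than one group, so its revenue is counted twice.

```python
from collections import defaultdict

# Hypothetical fact rows: (book, store, day, revenue).
# "revenue" is the measure; "book", "store", "day" are the dimensions.
DIMENSIONS = ("book", "store", "day")
FACTS = [
    ("Book A", "Store 1", "2024-01-01", 20.0),
    ("Book B", "Store 1", "2024-01-01", 15.0),
    ("Book A", "Store 2", "2024-01-01", 10.0),
    ("Book B", "Store 2", "2024-01-02", 5.0),
]


def summarize(facts, keep):
    """Sum revenue over every dimension that is NOT listed in `keep`."""
    idx = [DIMENSIONS.index(d) for d in keep]
    totals = defaultdict(float)
    for row in facts:
        key = tuple(row[i] for i in idx)
        totals[key] += row[3]
    return dict(totals)


# Summarizing over "book" and "day" leaves revenue per store.
revenue_by_store = summarize(FACTS, keep=("store",))
# Summarizing over all three dimensions leaves the grand total.
grand_total = summarize(FACTS, keep=())[()]

# Each fact belongs to exactly one store, so the per-store totals
# add up to the grand total.
stores_consistent = sum(revenue_by_store.values()) == grand_total

# Assumed (hypothetical) grouping in which "Book A" is in two categories.
BOOK_CATEGORIES = {
    "Book A": ["Fiction", "Bestseller"],
    "Book B": ["Fiction"],
}

revenue_by_category = defaultdict(float)
for book, _store, _day, revenue in FACTS:
    for category in BOOK_CATEGORIES[book]:
        revenue_by_category[category] += revenue  # Book A is counted twice

# The category totals no longer add up to the grand total.
categories_consistent = sum(revenue_by_category.values()) == grand_total

print(revenue_by_store, grand_total, stores_consistent)
print(dict(revenue_by_category), categories_consistent)
```

In this sketch the per-store totals can safely be added further, whereas adding the per-category totals would overstate revenue; checking whether sub-totals reproduce the grand total is one simple sanity test, not a complete characterization of summarizability.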


Recommended Reading

  1. Agrawal R, Gupta A, Sarawagi S. Modeling multidimensional databases. In: Proceedings of the 13th International Conference on Data Engineering; 1997. p. 232–43.
  2. Chan P, Shoshani A. Subject: a directory driven system for organizing and accessing large statistical databases. In: Proceedings of the 7th International Conference on Very Large Data Bases; 1981. p. 553–63.
  3. Codd EF, Codd SB, Salley CT. Providing OLAP (online analytical processing) to user-analysts: an IT mandate. Codd and Associates technical report; 1993.
  4. Gray J, Bosworth A, Layman A, Pirahesh H. Data cube: a relational aggregation operator generalizing group-by, cross-tabs and sub-totals. In: Proceedings of the 12th International Conference on Data Engineering; 1996. p. 152–9.
  5. Hurtado CA, Mendelzon AO. Reasoning about summarizability in heterogeneous multidimensional schemas. In: Proceedings of the 8th International Conference on Database Theory; 2001. p. 375–89.
  6. Hurtado CA, Gutiérrez C, Mendelzon A. Capturing summarizability with integrity constraints in OLAP. ACM Trans Database Syst. 2005;30(3):854–86.
  7. Lenz H-J, Shoshani A. Summarizability in OLAP and statistical data bases. In: Proceedings of the 9th International Conference on Scientific and Statistical Database Management; 1997. p. 132–43.
  8. Pedersen TB, Jensen CS. Multidimensional data modeling for complex data. In: Proceedings of the 15th International Conference on Data Engineering; 1999. p. 336–45.
  9. Rafanelli M, Shoshani A. STORM: a statistical object representation model. In: Proceedings of the 2nd International Conference on Scientific and Statistical Database Management; 1990. p. 14–29.
  10. Shoshani A. OLAP and statistical databases: similarities and differences. In: Proceedings of the 16th ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems; 1997. p. 185–96.

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. Lawrence Berkeley National Laboratory, Berkeley, USA

Section editors and affiliations

  • Torben Bach Pedersen (1)
  • Stefano Rizzi (2)
  1. Department of Computer Science, Aalborg University, Aalborg, Denmark
  2. DISI, Univ. of Bologna, Bologna, Italy