Synonyms
Piecewise-constant approximations
Definition
A B-bucket histogram of length N is a partition of the set [0, N) of N integers into intervals \( [b_0, b_1) \cup [b_1, b_2) \cup \dots \cup [b_{B-1}, b_B) \), where \( b_0 = 0 \) and \( b_B = N \), together with a collection of B heights \( h_j \), for \( 0 \le j < B \), one for each bucket. On point query i, the histogram answer is \( h_j \), where j is the index of the interval (or “bucket”) containing i; that is, the unique j with \( b_j \le i < b_{j+1} \). In vector notation, \( \chi_S \) is the vector that is 1 on the set S and zero elsewhere, and the answer vector of a histogram is \( \overrightarrow{H} = \sum_{0 \le j < B} h_j \, \chi_{[b_j, b_{j+1})} \).
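The definition above can be illustrated with a minimal Python sketch. The function names `point_query` and `answer_vector`, and the representation of a histogram as a list of B + 1 boundaries plus a list of B heights, are assumptions made for this illustration and are not taken from the entry.

```python
from bisect import bisect_right


def point_query(boundaries, heights, i):
    """Return h_j for the unique bucket j with b_j <= i < b_{j+1}.

    boundaries is [b_0, ..., b_B] with b_0 = 0 and b_B = N (hypothetical layout);
    heights is [h_0, ..., h_{B-1}].
    """
    n = boundaries[-1]
    if not 0 <= i < n:
        raise IndexError(f"query {i} is outside [0, {n})")
    # bisect_right counts the boundaries that are <= i; subtracting 1 gives j.
    j = bisect_right(boundaries, i) - 1
    return heights[j]


def answer_vector(boundaries, heights):
    """Expand the histogram into the length-N vector sum_j h_j * chi_[b_j, b_{j+1})."""
    return [
        heights[j]
        for j in range(len(heights))
        for _ in range(boundaries[j], boundaries[j + 1])
    ]


# A 3-bucket histogram of length 8: [0, 3), [3, 5), [5, 8).
boundaries = [0, 3, 5, 8]
heights = [2.0, 7.0, 1.0]
```

With these example values, `point_query(boundaries, heights, 3)` falls in the second bucket, and `answer_vector(boundaries, heights)` repeats each height once per position in its bucket. The binary search makes each point query cost O(log B) rather than O(N).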
A histogram, \( \overrightarrow{H} \), is often used to approximate some other function, \( \overrightarrow{A} \), on [0, N). In building a B-bucket histogram, it is desirable to choose B − 1 boundaries \( b_j \) and B heights \( h_j \) that tend to minimize some distance, e.g., the sum square error \( {\left\Vert...
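As a rough illustration of the sum square error objective, the following Python sketch scores a histogram against a fully stored signal. It assumes the boundaries are already fixed and that the whole of A is available in memory, so it is an offline sketch and not one of the streaming algorithms this entry concerns. The names `sum_square_error` and `mean_heights` are hypothetical. The sketch relies on the standard fact that, for fixed boundaries, the height minimizing the sum square error on a bucket is the mean of A over that bucket.

```python
def sum_square_error(a, boundaries, heights):
    """Sum over all i of (A[i] - H[i])^2, accumulated bucket by bucket."""
    err = 0.0
    for j, h in enumerate(heights):
        for i in range(boundaries[j], boundaries[j + 1]):
            err += (a[i] - h) ** 2
    return err


def mean_heights(a, boundaries):
    """For fixed boundaries, use each bucket's mean as its height.

    An empty bucket gets height 0.0; this choice is arbitrary because an
    empty bucket contributes nothing to the error.
    """
    heights = []
    for lo, hi in zip(boundaries, boundaries[1:]):
        bucket = a[lo:hi]
        heights.append(sum(bucket) / len(bucket) if bucket else 0.0)
    return heights


# Example signal of length 6 with assumed boundaries [0, 3), [3, 5), [5, 6).
a = [1.0, 3.0, 2.0, 8.0, 10.0, 4.0]
boundaries = [0, 3, 5, 6]
heights = mean_heights(a, boundaries)
sse = sum_square_error(a, boundaries, heights)
```

Choosing the boundaries themselves is the harder part of the problem and is not addressed by this sketch.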
Recommended Reading
Cormode G, Muthukrishnan S. An improved data stream summary: the count-min sketch and its applications. In: Proceedings of the 6th Latin American Symposium Theoretical Informatics; 2004, p. 29–38.
Gilbert A, Guha S, Indyk P, Kotidis Y, Muthukrishnan S, Strauss M. Fast, small-space algorithms for approximate histogram maintenance. In: Proceedings of the 34th Annual ACM Symposium on Theory of Computing; 2002, p. 389–98.
Guha S, Koudas N, Shim K. Approximation and streaming algorithms for histogram construction problems. ACM Trans Database Syst. 2006;31(1):396–438.
Ioannidis Y. The history of histograms (abridged). In: Proceedings of the 29th International Conference on Very Large Data Bases; 2003, p. 19–30.
Jagadish H, Koudas N, Muthukrishnan S, Poosala V, Sevcik K, Suel T. Optimal histograms with quality guarantees. In: Proceedings of the 24th International Conference on Very Large Data Bases; 1998, p. 275–86.
Muthukrishnan S, Strauss M. Approximate histogram and wavelet summaries of streaming data. In: Data-stream management – processing high-speed data streams. New York: Springer; 2009 (Data-Centric Systems and Applications Series).
Copyright information
© 2018 Springer Science+Business Media, LLC, part of Springer Nature
About this entry
Cite this entry
Strauss, M.J. (2018). Histograms on Streams. In: Liu, L., Özsu, M.T. (eds) Encyclopedia of Database Systems. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-8265-9_191
DOI: https://doi.org/10.1007/978-1-4614-8265-9_191
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4614-8266-6
Online ISBN: 978-1-4614-8265-9
eBook Packages: Computer Science, Reference Module Computer Science and Engineering