Hierarchical Data Summarization

Tanin, Egemen; Ali, Mohammed Eunus

doi:10.1007/978-1-4614-8265-9_536

Reference work entry
First Online: 01 January 2018

Synonyms

Hierarchical data summarization

Definition

Data management systems frequently produce summaries of a set of records over different attributes. Common examples are the number of records that fall into each of a set of ranges of an attribute, or the minimum value in each of these ranges. Hierarchical versions of data summaries can be used to access summaries more efficiently at different resolutions, or to investigate a hierarchy that is inherent to the data type, such as dates. A data structure or algorithm is labeled hierarchical if it uses subcomponents to systematically obtain conceptually larger components. The method of obtaining a larger component is usually induced by the user’s understanding of the domain, although hierarchies can also be created automatically by a set of rules embedded in the system. Thus, rules used in a data structure’s creation, e.g., B+-trees,...
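As an illustration only, the following minimal Python sketch builds the two summaries named in the definition (record count and minimum value) over a date hierarchy, obtaining each month summary from its day summaries and each year summary from its month summaries. The function names `summarize` and `roll_up`, the (date, value) record layout, and the day/month/year levels are assumptions made for this example; they are not taken from the entry.

```python
from collections import defaultdict
from datetime import date


def roll_up(children, prefix_len):
    """Merge child summaries into parent summaries.

    children maps a tuple key, e.g. (year, month, day), to (count, minimum).
    The parent key is the first prefix_len elements of the child key.
    """
    parents = {}
    for key, (count, lowest) in children.items():
        parent = key[:prefix_len]
        if parent in parents:
            p_count, p_lowest = parents[parent]
            parents[parent] = (p_count + count, min(p_lowest, lowest))
        else:
            parents[parent] = (count, lowest)
    return parents


def summarize(records):
    """Build (count, minimum) summaries at day, month, and year resolution.

    records is an iterable of (datetime.date, value) pairs (assumed layout).
    """
    days = defaultdict(lambda: (0, None))
    for d, value in records:
        key = (d.year, d.month, d.day)
        count, lowest = days[key]
        days[key] = (count + 1, value if lowest is None else min(lowest, value))
    months = roll_up(days, 2)   # larger components built from day summaries
    years = roll_up(months, 1)  # and again from month summaries
    return {"day": dict(days), "month": months, "year": years}


if __name__ == "__main__":
    sample = [
        (date(2018, 1, 1), 5),
        (date(2018, 1, 1), 3),
        (date(2018, 1, 15), 7),
        (date(2018, 2, 2), 1),
    ]
    for level, table in summarize(sample).items():
        print(level, table)
```

Because count and minimum can both be combined from partial results, the coarser levels never revisit the raw records. A summary that cannot be merged this way (a median, for example) would need a different roll-up strategy.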

Author information

Authors and Affiliations

Computing and Information Systems, University of Melbourne, Melbourne, VIC, Australia
Egemen Tanin
Department of Computer Science and Engineering, Bangladesh University of Engineering and Technology (BUET), Dhaka, Bangladesh
Mohammed Eunus Ali

Corresponding author

Correspondence to Egemen Tanin.

Editor information

Editors and Affiliations

Georgia Institute of Technology College of Computing, Atlanta, GA, USA
Ling Liu
University of Waterloo School of Computer Science, Waterloo, ON, Canada
M. Tamer Özsu

Cite this entry

Tanin, E., Ali, M.E. (2018). Hierarchical Data Summarization. In: Liu, L., Özsu, M.T. (eds) Encyclopedia of Database Systems. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-8265-9_536

DOI: https://doi.org/10.1007/978-1-4614-8265-9_536
Published: 07 December 2018
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4614-8266-6
Online ISBN: 978-1-4614-8265-9
eBook Packages: Computer Science; Reference Module Computer Science and Engineering
