Hierarchical Data Summarization

Synonyms

Hierarchical data summarization

Definition

Data management systems frequently produce summaries of a set of records over different attributes. Common examples are the number of records that fall into each of a set of ranges of an attribute, or the minimum value in each range. Hierarchical versions of data summaries can be used to access summaries efficiently at different resolutions, or to investigate directly a hierarchy that is inherent to the data type, such as dates. A data structure or algorithm is labeled hierarchical if it systematically combines subcomponents to obtain conceptually larger components. The method of obtaining a larger component is often induced by the user’s understanding of the domain, but hierarchies can also be created automatically by a set of rules embedded in the system. Thus, rules used in a data structure’s creation, e.g., B+-trees,...
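As an illustration of the definition, the following minimal Python sketch builds a count and minimum summary over the date hierarchy (day, month, year). It is not taken from the entry: the function names `summarize_days` and `roll_up`, the sample records, and the choice of a day/month/year hierarchy are all assumptions made for the example. The point it tries to show is that each larger component is obtained only from the summaries of its subcomponents, without rescanning the records.

```python
from datetime import date


def summarize_days(records):
    """Leaf level: (count, minimum) of the values recorded on each day."""
    leaves = {}
    for day, value in records:
        count, minimum = leaves.get(day, (0, value))
        leaves[day] = (count + 1, min(minimum, value))
    return leaves


def roll_up(summaries, parent_of):
    """Combine child summaries into parent summaries.

    Counts are added and minima are compared, so a parent is computed
    from its children alone (hypothetical helper, not from the entry).
    """
    parents = {}
    for key, (count, minimum) in summaries.items():
        parent = parent_of(key)
        if parent in parents:
            p_count, p_min = parents[parent]
            parents[parent] = (p_count + count, min(p_min, minimum))
        else:
            parents[parent] = (count, minimum)
    return parents


# Made-up sample records: (date, measured value).
records = [
    (date(2018, 1, 3), 7.0),
    (date(2018, 1, 3), 4.5),
    (date(2018, 1, 20), 9.0),
    (date(2018, 2, 1), 3.0),
    (date(2017, 12, 31), 6.0),
]

by_day = summarize_days(records)                          # finest resolution
by_month = roll_up(by_day, lambda d: (d.year, d.month))   # days -> months
by_year = roll_up(by_month, lambda ym: ym[0])             # months -> years
```

A query at a coarse resolution (for example, the minimum value in 2018) can then be answered from `by_year` directly, while finer questions fall back to `by_month` or `by_day`. Count and minimum combine this way because both can be merged from partial results; other summaries may need extra state per node.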

Recommended Reading

  1. Aboulnaga A, Aref WG. Window query processing in linear quadtrees. Distrib Parallel Databases. 2001;10(2):111–26.
  2. Ahmad Y, Nath S. COLR-Tree: communication-efficient spatio-temporal indexing for a sensor data web portal. In: Proceedings of the 24th International Conference on Data Engineering; 2008. p. 784–93.
  3. Ali ME, Zhang R, Tanin E, Kulik L. A motion-aware approach to continuous retrieval of 3D objects. In: Proceedings of the 24th International Conference on Data Engineering; 2008. p. 843–52.
  4. Antoshenkov G. Query processing in DEC RDB: major issues and future challenges. IEEE Data Eng Bull. 1993;16(4):42–5.
  5. Aoki PM. Generalizing “search” in generalized search trees. In: Proceedings of the 14th International Conference on Data Engineering; 1998. p. 380–9.
  6. Bruno N, Chaudhuri S, Gravano L. STHoles: a multidimensional workload-aware histogram. SIGMOD Rec. 2001;30(2):211–22.
  7. Camerra A, Palpanas T, Shieh J, Keogh E. iSAX 2.0: indexing and mining one billion time series. In: Proceedings of the 10th IEEE International Conference on Data Mining; 2010. p. 58–67.
  8. Dean J, Ghemawat S. MapReduce: simplified data processing on large clusters. Commun ACM. 2008;51(1):107–13.
  9. Ganesan D, Estrin D, Heidemann J. Dimensions: why do we need a new data handling architecture for sensor networks? In: Proceedings of the ACM Workshop on Hot Topics in Networks; 2002.
  10. Gao J, Guibas LJ, Hershberger J, Zhang L. Fractionally cascaded information in a sensor network. In: Proceedings of the 3rd International Symposium on Information Processing in Sensor Networks; 2004. p. 311–9.
  11. Greenstein B, Estrin D, Govindan R, Ratnasamy S, Shenker S. DIFS: a distributed index for features in sensor networks. In: Proceedings of the IEEE Workshop on Sensor Network Protocols and Applications; 2003.
  12. Hellerstein JM, Naughton JF, Pfeffer A. Generalized search trees for database systems. In: Proceedings of the 21st International Conference on Very Large Data Bases; 1995. p. 562–73.
  13. Keogh E, Chakrabarti K, Pazzani M, Mehrotra S. Dimensionality reduction for fast similarity search in large time series databases. J Knowl Inf Syst. 2000;3(3):263–86.
  14. Kitsos I, Magoutis K, Tzitzikas Y. Scalable entity-based summarization of web search results using MapReduce. Distrib Parallel Databases. 2014;32(3):405–46.
  15. Knuth DE. Sorting and searching, the art of computer programming, vol. 3. Redwood City: Addison Wesley Publishing; 1973.
  16. Li X, Kim YJ, Govindan R, Hong W. Multi-dimensional range queries in sensor networks. In: Proceedings of the 1st International Conference on Embedded Networked Sensor Systems; 2003. p. 5–7.
  17. Madden SR, Franklin MJ, Hellerstein JM, Hong W. TinyDB: an acquisitional query processing system for sensor networks. ACM Trans Database Syst. 2005;30(1):122–73.
  18. Nath S, Gibbons PB, Seshan S, Anderson ZR. Synopsis diffusion for robust aggregation in sensor networks. In: Proceedings of the 2nd International Conference on Embedded Networked Sensor Systems; 2004. p. 250–62.
  19. Ordonez C, Mohanam N, Garcia-Alvarado C. PCA for large data sets with parallel data summarization. Distrib Parallel Databases. 2014;32(3):377–403.
  20. Ratnasamy S, Francis P, Handley M, Karp RM, Shenker S. A scalable content-addressable network. In: Proceedings of the 2001 Conference on Applications, Technologies, Architectures, and Protocols for Computer Communication; 2001. p. 161–72.
  21. Reiss F, Garofalakis M, Hellerstein JM. Compact histograms for hierarchical identifiers. In: Proceedings of the 32nd International Conference on Very Large Data Bases; 2006. p. 870–81.
  22. Samet H, Sankaranarayanan J, Auerbach M. Indexing methods for moving object databases: games and other applications. In: Proceedings of the ACM SIGMOD International Conference on Management of Data; 2013. p. 169–80.
  23. Wang J, Wu S, Gao H, Li J, Ooi BC. Indexing multi-dimensional data in a cloud system. In: Proceedings of the ACM SIGMOD International Conference on Management of Data; 2010. p. 591–602.
  24. Wang W, Yang J, Muntz R. STING: a statistical information grid approach to spatial data mining. In: Proceedings of the 23rd International Conference on Very Large Data Bases; 1997. p. 186–95.
  25. Wu S, Jiang D, Ooi BC, Wu K-L. Efficient B-tree based indexing for cloud data processing. Proc VLDB Endowment. 2010;3(1):1207–18.

Author information

Corresponding author

Correspondence to Egemen Tanin.

Copyright information

© 2018 Springer Science+Business Media, LLC, part of Springer Nature

About this entry

Cite this entry

Tanin, E., Ali, M.E. (2018). Hierarchical Data Summarization. In: Liu, L., Özsu, M.T. (eds) Encyclopedia of Database Systems. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-8265-9_536
