Encyclopedia of Database Systems

2018 Edition
| Editors: Ling Liu, M. Tamer Özsu

Managing Compressed Structured Text

  • Nieves R. Brisaboa
  • Ana Cerdeira-Pena
  • Gonzalo Navarro
Reference work entry
DOI: https://doi.org/10.1007/978-1-4614-8265-9_72

Synonyms

Compressing XML; Searching compressed XML

Definition

Compressing structured text is the problem of creating a reduced-space representation from which the original data can be re-created exactly. Compared to plain text compression, the goal is to take advantage of the structural properties of the data. A more ambitious goal is to manipulate this text directly in compressed form, without decompressing it. This entry focuses on compressing, navigating, and searching structured text, as those are the areas where the most advances have been made.
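
To make "taking advantage of the structural properties" concrete, the following minimal sketch separates the markup of a small XML string from its text content, compresses the two streams independently, and checks that the original is re-created exactly. It is an illustration only, not the method of any system covered by this entry: the function names (split_streams, pack, unpack) are hypothetical, zlib stands in for whatever compressor would actually be used, and the input is assumed to contain no NUL character, which the sketch uses as a separator.

```python
import re
import zlib

# Tags, runs of text, or a stray "<"; together these cover every character.
TOKEN = re.compile(r"<[^>]*>|[^<]+|<")
SEP = "\x00"          # assumption: never occurs in the input
PLACEHOLDER = "#"     # marks, in the structure stream, where a text run goes


def split_streams(xml: str):
    """Split the input into a structure stream (tags) and a content stream (text)."""
    structure, content = [], []
    for tok in TOKEN.findall(xml):
        if tok.startswith("<"):
            structure.append(tok)
        else:
            structure.append(PLACEHOLDER)
            content.append(tok)
    return structure, content


def pack(xml: str):
    """Compress the two streams separately (zlib is a stand-in compressor)."""
    structure, content = split_streams(xml)
    return (
        zlib.compress(SEP.join(structure).encode("utf-8")),
        zlib.compress(SEP.join(content).encode("utf-8")),
    )


def unpack(structure_blob: bytes, content_blob: bytes) -> str:
    """Re-create the original string exactly from the two compressed streams."""
    structure = zlib.decompress(structure_blob).decode("utf-8").split(SEP)
    content = iter(zlib.decompress(content_blob).decode("utf-8").split(SEP))
    # Each placeholder consumes the next text run; tags are copied as-is.
    return "".join(next(content) if tok == PLACEHOLDER else tok for tok in structure)


if __name__ == "__main__":
    doc = '<lib><book id="1"><title>Dune</title></book></lib>'
    assert unpack(*pack(doc)) == doc  # lossless round trip
```

Grouping similar items (all tags together, all text together) tends to give a general-purpose compressor more regular input, but the sketch makes no claim about the resulting sizes, and it does not support navigation or search on the compressed form.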

Historical Background

Modeling data using structured text has been a topic of interest at least since the 1980s, with a significant burst of activity in the 1990s [3]. Since then, the widespread adoption of XML (appearing in 1998, see the current version at http://www.w3.org/TR/xml) as the standard to represent structured text has unified the efforts of the community around this particular format. Very early, however, the same...


Recommended Reading

  1. Arroyuelo D, Cánovas R, Navarro G, Sadakane K. Succinct trees in practice. In: Proceedings of the 11th Workshop on Algorithm Engineering and Experiments; 2009. p. 84–97.
  2. Arroyuelo D, Claude F, Maneth S, Mäkinen V, Navarro G, Nguyen K, Sirén J, Välimäki N. Fast in-memory XPath search using compressed indexes. In: Proceedings of the 26th International Conference on Data Engineering; 2010. p. 417–28.
  3. Baeza-Yates R, Navarro G. Integrating contents and structure in text retrieval. ACM SIGMOD Rec. 1996;25(1):67–79.
  4. Barbay J, Claude F, Gagie T, Navarro G, Nekrich Y. Efficient fully-compressed sequence representations. Algorithmica. 2014;69(1):232–68.
  5. Brisaboa NR, Fariña A, Navarro G, Paramá JR. Lightweight natural language text compression. Inf Retr. 2007;10(1):1–33.
  6. Brisaboa NR, Cerdeira-Pena A, Navarro G. XXS: efficient XPath evaluation on compressed XML documents. ACM Trans Inf Syst. 2014;32(3):13.
  7. Cerdeira-Pena A. Compressed self-indexed XML representation with efficient XPath evaluation. PhD thesis, Department of Computer Science, University of A Coruña, 2013.
  8. Ferragina P, Manzini G. Indexing compressed text. J ACM. 2005;52(4):552–81.
  9. Ferragina P, Luccio F, Manzini G, Muthukrishnan S. Compressing and indexing labeled trees, with applications. J ACM. 2009;57(1):4:1–4:33.
  10. Gottlob G, Koch C, Pichler R. Efficient algorithms for processing XPath queries. ACM Trans Database Syst. 2005;30(2):444–91.
  11. Lohrey M, Maneth S, Mennicke R. The complexity of tree automata and XPath on grammar-compressed trees. Theor Comput Sci. 2006;363(2):196–210.
  12. Lohrey M, Maneth S, Mennicke R. XML tree structure compression using RePair. Inf Syst. 2013;38(8):1150–67.
  13. Mäkinen V, Navarro G, Sirén J, Välimäki N. Storage and retrieval of highly repetitive sequence collections. J Comput Biol. 2010;17(3):281–308.
  14. Navarro G, Mäkinen V. Compressed full-text indexes. ACM Comput Surv. 2007;39(1):2.
  15. Navarro G, Ordóñez A. Faster compressed suffix trees for repetitive text collections. In: Proceedings of the 13th International Symposium on Experimental Algorithms; 2014. p. 424–35.
  16. Navarro G, Ordóñez A. Grammar compressed sequences with rank/select support. In: Proceedings of the 21st International Symposium on String Processing and Information Retrieval; 2014.
  17. Sakr S. XML compression techniques: a survey and comparison. J Comput Syst Sci. 2009;75(5):303–22.

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  • Nieves R. Brisaboa (1)
  • Ana Cerdeira-Pena (1)
  • Gonzalo Navarro (2)

  1. Database Laboratory, Department of Computer Science, University of A Coruña, A Coruña, Spain
  2. Department of Computer Science, University of Chile, Santiago, Chile