Tree Compression with Top Trees Revisited

  • Conference paper
Experimental Algorithms (SEA 2015)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9125)

Abstract

We revisit tree compression with top trees (Bille et al. [2]), and present several improvements to the compressor and its analysis. By significantly reducing the amount of information stored and guiding the compression step using a RePair-inspired heuristic, we obtain a fast compressor achieving good compression ratios, addressing an open problem posed by [2]. We show how, with relatively small overhead, the compressed file can be converted into an in-memory representation that supports basic navigation operations in worst-case logarithmic time without decompression. We also show a much improved worst-case bound on the size of the output of top-tree compression (answering an open question posed in a talk on this algorithm by Weimann in 2012).
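
As a rough illustration of the idea behind RePair [15], on which the heuristic mentioned above is modelled, the following Python sketch applies the classic string version: it repeatedly replaces the most frequent adjacent pair of symbols with a fresh nonterminal. This is not the compressor described in this paper, which works on top trees rather than strings. The function names `repair` and `expand` and the `("R", i)` naming of new symbols are assumptions made for this example, and the pair counting is deliberately naive (it recounts every round and counts overlapping occurrences).

```python
from collections import Counter


def repair(text):
    """Naive string RePair: replace the most frequent adjacent pair until
    no pair occurs at least twice. Returns (compressed sequence, rules)."""
    seq = list(text)
    rules = {}      # nonterminal -> (left symbol, right symbol)
    next_id = 0
    while True:
        pairs = Counter(zip(seq, seq[1:]))
        if not pairs:
            break
        pair, count = pairs.most_common(1)[0]
        if count < 2:
            break
        symbol = ("R", next_id)   # hypothetical naming scheme for nonterminals
        next_id += 1
        rules[symbol] = pair
        # Left-to-right scan replacing non-overlapping occurrences of the pair.
        out, i = [], 0
        while i < len(seq):
            if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
                out.append(symbol)
                i += 2
            else:
                out.append(seq[i])
                i += 1
        seq = out
    return seq, rules


def expand(seq, rules):
    """Invert repair(): expand nonterminals back into the original symbols."""
    out = []
    stack = list(reversed(seq))
    while stack:
        s = stack.pop()
        if s in rules:
            left, right = rules[s]
            stack.append(right)   # pushed first so that left is expanded first
            stack.append(left)
        else:
            out.append(s)
    return out


if __name__ == "__main__":
    compressed, grammar = repair("abababab")
    print(compressed, grammar)
    print("".join(expand(compressed, grammar)))
```

The round trip through `expand` recovers the input exactly, so the rules together with the shortened sequence are a lossless representation. A practical implementation would maintain pair counts incrementally instead of recounting in every round.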

References

  1. Alstrup, S., Holm, J., Lichtenberg, K.D., Thorup, M.: Maintaining information in fully dynamic trees with top trees. ACM TALG 1(2), 243–264 (2005)

  2. Bille, P., Gørtz, I.L., Landau, G.M., Weimann, O.: Tree compression with top trees. Information and Computation (2015). http://doi.org/10.1016/j.ic.2014.12.012

  3. Bille, P., Landau, G.M., Raman, R., Sadakane, K., Satti, S.R., Weimann, O.: Random access to grammar-compressed strings. In: Proc. SODA, pp. 373–389. SIAM (2011)

  4. Buneman, P., Grohe, M., Koch, C.: Path queries on compressed XML. In: Proc. 29th VLDB, pp. 141–152. VLDB Endowment (2003)

  5. Busatto, G., Lohrey, M., Maneth, S.: Grammar-based tree compression. Tech. Rep. EPFL-REPORT-52615, École Polytechnique Fédérale de Lausanne (2004)

  6. Busatto, G., Lohrey, M., Maneth, S.: Efficient memory representation of XML documents. In: Bierman, G., Koch, C. (eds.) DBPL 2005. LNCS, vol. 3774, pp. 199–216. Springer, Heidelberg (2005)

  7. Charikar, M., Lehman, E., Liu, D., Panigrahy, R., Prabhakaran, M., Sahai, A., Shelat, A.: The smallest grammar problem. IEEE Trans Inf Theory 51(7), 2554–2576 (2005)

  8. Delpratt, O.D.: Space efficient in-memory representation of XML documents. Ph.D. thesis, University of Leicester, supervisor: Rajeev Raman (2009)

  9. Downey, P.J., Sethi, R., Tarjan, R.E.: Variations on the common subexpression problem. Journal of the ACM (JACM) 27(4), 758–771 (1980)

  10. Ferragina, P., Luccio, F., Manzini, G., Muthukrishnan, S.: Compressing and indexing labeled trees, with applications. Journal of the ACM (JACM) 57(1), 4 (2009)

  11. Gog, S., Beller, T., Moffat, A., Petri, M.: From theory to practice: plug and play with succinct data structures. In: Gudmundsson, J., Katajainen, J. (eds.) SEA 2014. LNCS, vol. 8504, pp. 326–337. Springer, Heidelberg (2014)

  12. Hirakawa, M., Tanaka, T., Hashimoto, Y., Kuroda, M., Takagi, T., Nakamura, Y.: JSNP: a database of common gene variations in the Japanese population. Nucleic Acids Research 30(1), 158–162 (2002). http://snp.ims.u-tokyo.ac.jp/XML/Mapped/old/20060612/

  13. Jacobson, G.: Space-efficient static trees and graphs. In: Proc. 30th FOCS, pp. 549–554. IEEE (1989)

  14. Jez, A., Lohrey, M.: Approximation of smallest linear tree grammar. CoRR abs/1309.4958 (2013). http://arxiv.org/abs/1309.4958

  15. Larsson, N.J., Moffat, A.: Off-line dictionary-based compression. Proceedings of the IEEE 88(11), 1722–1732 (2000)

  16. Lohrey, M., Maneth, S.: The complexity of tree automata and XPath on grammar-compressed trees. Theoretical Computer Science 363(2), 196–210 (2006)

  17. Lohrey, M., Maneth, S., Mennicke, R.: XML tree structure compression using RePair. Information Systems 38(8), 1150–1167 (2013)

  18. Maneth, S., Busatto, G.: Tree transducers and tree compressions. In: Walukiewicz, I. (ed.) FOSSACS 2004. LNCS, vol. 2987, pp. 363–377. Springer, Heidelberg (2004)

  19. Maruyama, S., Nakahara, M., Kishiue, N., Sakamoto, H.: ESP-index: A compressed index based on edit-sensitive parsing. Journal of Discrete Algorithms 18, 100–112 (2013)

  20. Miklau, G.: University of Washington XML Repository. http://www.cs.washington.edu/research/xmldatasets

  21. Munro, J.I., Raman, V.: Succinct representation of balanced parentheses and static trees. SIAM Journal on Computing 31(3), 762–776 (2001)

  22. Poyias, A.: XXML: Handling extra-large XML documents (2013). http://hdl.handle.net/2381/27744

  23. Pătraşcu, M.: Succincter. In: Proc. 49th FOCS, pp. 305–313. IEEE (2008)

  24. Wang, F., Li, J., Homayounfar, H.: A space efficient XML DOM parser. Data & Knowledge Engineering 60(1), 185–207 (2007)

  25. Wikimedia: enwiki dump. http://dumps.wikimedia.org/enwiki/

Author information

Corresponding author

Correspondence to Lorenz Hübschle-Schneider.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Hübschle-Schneider, L., Raman, R. (2015). Tree Compression with Top Trees Revisited. In: Bampis, E. (ed.) Experimental Algorithms. SEA 2015. Lecture Notes in Computer Science, vol 9125. Springer, Cham. https://doi.org/10.1007/978-3-319-20086-6_2

  • DOI: https://doi.org/10.1007/978-3-319-20086-6_2

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-20085-9

  • Online ISBN: 978-3-319-20086-6

  • eBook Packages: Computer Science, Computer Science (R0)
