Benchmarking the Compression of XML Node Streams

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 6193)

Abstract

In recent years, many approaches to XML twig pattern query processing have been developed. Holistic approaches are particularly significant because they provide a theoretical model for optimal processing of some query classes and have very low main memory complexity. Holistic algorithms are supported by a stream abstract data type (ADT), which is usually implemented using inverted lists or special-purpose data structures. In this article, we focus on an efficient implementation of the stream ADT. We utilize previously proposed fast decoding algorithms for several variable-length prefix codes: Elias-delta, Fibonacci of order 2 and 3, and Elias-Fibonacci codes. We compare the efficiency of stream access under the various decompression algorithms, and we compare these results with those of data structures that use no compression. We show that compression improves the efficiency of XML query processing.
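For readers unfamiliar with the codes named in the abstract, the following is a minimal sketch of two of them, Elias-delta and Fibonacci of order 2, applied to a sequence of positive integers. It uses plain bit-by-bit coders over Python strings of "0" and "1"; it does not reproduce the fast decoding algorithms the paper benchmarks, and all function names are illustrative assumptions, not taken from the paper.

```python
def elias_delta_encode(n: int) -> str:
    """Elias-delta codeword of a positive integer, as a bit string."""
    if n < 1:
        raise ValueError("Elias-delta codes are defined for n >= 1")
    nbits = n.bit_length()        # length of n in binary
    lbits = nbits.bit_length()    # length of that length in binary
    # (lbits - 1) zeros, then nbits in binary, then n without its leading 1.
    return "0" * (lbits - 1) + format(nbits, "b") + format(n, "b")[1:]


def elias_delta_decode(bits: str, pos: int = 0) -> tuple[int, int]:
    """Decode one codeword starting at pos; return (value, next position)."""
    zeros = 0
    while bits[pos + zeros] == "0":
        zeros += 1
    pos += zeros
    nbits = int(bits[pos:pos + zeros + 1], 2)
    pos += zeros + 1
    value = int("1" + bits[pos:pos + nbits - 1], 2)
    return value, pos + nbits - 1


def fibonacci_encode(n: int) -> str:
    """Order-2 Fibonacci codeword: Zeckendorf digits, lowest term first, plus a final 1."""
    if n < 1:
        raise ValueError("Fibonacci codes are defined for n >= 1")
    fibs = [1, 2]
    while fibs[-1] <= n:
        fibs.append(fibs[-1] + fibs[-2])
    digits = []
    for f in reversed(fibs[:-1]):     # greedy, largest Fibonacci number first
        if f <= n:
            digits.append("1")
            n -= f
        elif digits:                  # skip leading zeros
            digits.append("0")
    return "".join(reversed(digits)) + "1"


def fibonacci_decode(bits: str, pos: int = 0) -> tuple[int, int]:
    """Decode one codeword starting at pos; two consecutive 1s end the codeword."""
    value, a, b, prev = 0, 1, 2, "0"
    while True:
        bit = bits[pos]
        pos += 1
        if bit == "1":
            if prev == "1":
                return value, pos
            value += a
        a, b = b, a + b
        prev = bit


def encode_stream(values, encode) -> str:
    """Concatenate the codewords of all values into one bit string."""
    return "".join(encode(v) for v in values)


def decode_stream(bits: str, decode) -> list[int]:
    """Decode codewords until the bit string is exhausted."""
    out, pos = [], 0
    while pos < len(bits):
        value, pos = decode(bits, pos)
        out.append(value)
    return out


if __name__ == "__main__":
    sample = [1, 2, 10, 11, 300]      # made-up values, not data from the paper
    for name, enc, dec in [("Elias-delta", elias_delta_encode, elias_delta_decode),
                           ("Fibonacci-2", fibonacci_encode, fibonacci_decode)]:
        bits = encode_stream(sample, enc)
        assert decode_stream(bits, dec) == sample
        print(name, len(bits), "bits")
```

Both codes are prefix codes, so codewords can be concatenated without separators and decoded sequentially, which is the access pattern a stream needs. Fibonacci of order 3 and Elias-Fibonacci codes are not covered by this sketch.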

Work is partially supported by Grants of GACR No. P202/10/0573 and SGS, Technical University of Ostrava, No. SP/2010138, Czech Republic.

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Bača, R., Walder, J., Pawlas, M., Krátký, M. (2010). Benchmarking the Compression of XML Node Streams. In: Yoshikawa, M., Meng, X., Yumoto, T., Ma, Q., Sun, L., Watanabe, C. (eds) Database Systems for Advanced Applications. DASFAA 2010. Lecture Notes in Computer Science, vol 6193. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-14589-6_19

  • DOI: https://doi.org/10.1007/978-3-642-14589-6_19

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-14588-9

  • Online ISBN: 978-3-642-14589-6

  • eBook Packages: Computer Science (R0)
