Abstract
In recent years, many approaches to XML twig pattern query processing have been developed. Holistic approaches are particularly significant because they provide a theoretical model for optimal processing of some query classes and have very low main memory complexity. Holistic algorithms rely on a stream abstract data type (ADT), which is usually implemented using inverted lists or special-purpose data structures. In this article, we focus on an efficient implementation of the stream ADT. We utilize previously proposed fast decoding algorithms for several prefix variable-length codes, namely Elias-delta, Fibonacci of order 2 and 3, and Elias-Fibonacci codes. We compare the efficiency of stream access under the various decompression algorithms, and we compare these results with those for data structures where no compression is used. We show that compression improves the efficiency of XML query processing.
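To make the codes named above concrete, the following minimal sketch encodes and decodes positive integers with the Fibonacci code of order 2 and with the Elias-delta code. It is a naive bit-string illustration only, not the fast decoding algorithms the abstract refers to; the function names and the sample list of stream values are assumptions made for this example.

```python
def fibonacci_encode(n: int) -> str:
    """Fibonacci code of order 2: Zeckendorf bits, lowest term first, plus a closing '1'."""
    if n < 1:
        raise ValueError("only positive integers can be encoded")
    fibs = [1, 2]
    while fibs[-1] + fibs[-2] <= n:
        fibs.append(fibs[-1] + fibs[-2])
    fibs = [f for f in fibs if f <= n]
    bits = []
    remainder = n
    for f in reversed(fibs):  # greedy choice, largest Fibonacci number first
        if f <= remainder:
            bits.append("1")
            remainder -= f
        else:
            bits.append("0")
    return "".join(reversed(bits)) + "1"


def fibonacci_decode_stream(bits: str) -> list[int]:
    """Decode concatenated codewords; two adjacent '1' bits end a codeword."""
    values, total, prev, a, b = [], 0, "0", 1, 2
    for bit in bits:
        if bit == "1" and prev == "1":
            values.append(total)
            total, prev, a, b = 0, "0", 1, 2
            continue
        if bit == "1":
            total += a
        a, b = b, a + b
        prev = bit
    return values


def elias_delta_encode(n: int) -> str:
    """Elias-delta: Elias-gamma code of the bit length, then n without its leading 1."""
    if n < 1:
        raise ValueError("only positive integers can be encoded")
    binary = bin(n)[2:]
    length_binary = bin(len(binary))[2:]
    return "0" * (len(length_binary) - 1) + length_binary + binary[1:]


def elias_delta_decode_stream(bits: str) -> list[int]:
    """Decode concatenated Elias-delta codewords (input assumed well formed)."""
    values, pos = [], 0
    while pos < len(bits):
        zeros = 0
        while bits[pos] == "0":
            zeros += 1
            pos += 1
        length = int(bits[pos:pos + zeros + 1], 2)
        pos += zeros + 1
        values.append(int("1" + bits[pos:pos + length - 1], 2))
        pos += length - 1
    return values


# Hypothetical stream values, e.g. gaps between consecutive node labels.
sample = [1, 4, 11, 10, 2]
fib_bits = "".join(fibonacci_encode(v) for v in sample)
delta_bits = "".join(elias_delta_encode(v) for v in sample)
print(fibonacci_decode_stream(fib_bits) == sample)
print(elias_delta_decode_stream(delta_bits) == sample)
```

Both codes are prefix codes, so codewords can be concatenated without separators and still be decoded unambiguously, which is what makes them usable for storing a sequence of integers in a stream.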
Work is partially supported by Grants of GACR No. P202/10/0573 and SGS, Technical University of Ostrava, No. SP/2010138, Czech Republic.
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Bača, R., Walder, J., Pawlas, M., Krátký, M. (2010). Benchmarking the Compression of XML Node Streams. In: Yoshikawa, M., Meng, X., Yumoto, T., Ma, Q., Sun, L., Watanabe, C. (eds) Database Systems for Advanced Applications. DASFAA 2010. Lecture Notes in Computer Science, vol 6193. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-14589-6_19
DOI: https://doi.org/10.1007/978-3-642-14589-6_19
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-14588-9
Online ISBN: 978-3-642-14589-6
eBook Packages: Computer Science (R0)