Abstract
Whenever the growing amount of XML data that has to be stored, processed, exchanged, or transmitted becomes a major cost driver or performance bottleneck, XML compression is an important way to reduce these problems. However, many applications, e.g. those exchanging XML data streams, also require efficient path query processing on the structure of compressed XML data streams. We present an XML compression technique called DAG+BSBC, which extends Bit-Stream-Based-Compression (BSBC) [3] by a sparse index to compressed constants that reflects DAG pointers. Furthermore, DAG+BSBC supports XML stream compression and queries on compressed data, and it provides a compression ratio that not only significantly outperforms that of other queriable XML compression techniques, like XGrind, but is also very competitive with non-queriable compression techniques like gzip and XMill.
References
Arion, A., Bonifati, A., Manolescu, I., Pugliese, A.: XQueC: A Query-Conscious Compressed XML Database. ACM Transactions on Internet Technology (to appear)
Bayardo, R.J., Gruhl, D., Josifovski, V., Myllymaki, J.: An evaluation of binary XML encoding optimizations for fast stream based XML processing. In: Proc. of the 13th international conference on World Wide Web (2004)
Böttcher, S., Hartel, R., Heinzemann, C.: Towards a succinct data format for XML streams. In: International Conference on Web Information Systems (WEBIST) (2008)
Böttcher, S., Steinmetz, R., Klein, N.: XML Index Compression by DTD Subtraction. In: International Conference on Enterprise Information Systems (ICEIS) (2007)
Böttcher, S., Steinmetz, R.: Data Management for Mobile Ajax Web 2.0 Applications. In: Wagner, R., Revell, N., Pernul, G. (eds.) DEXA 2007. LNCS, vol. 4653, pp. 424–433. Springer, Heidelberg (2007)
Buneman, P., Grohe, M., Koch, C.: Path Queries on Compressed XML. In: VLDB (2003)
Burrows, M., Wheeler, D.: A block sorting lossless data compression algorithm. Technical Report 124, Digital Equipment Corporation (1994)
Busatto, G., Lohrey, M., Maneth, S.: Efficient Memory Representation of XML Documents. In: Bierman, G., Koch, C. (eds.) DBPL 2005. LNCS, vol. 3774, pp. 199–216. Springer, Heidelberg (2005)
Candan, K.S., Hsiung, W.-P., Chen, S., Tatemura, J., Agrawal, D.: AFilter: Adaptable XML Filtering with Prefix-Caching and Suffix-Clustering. In: VLDB (2006)
Cheney, J.: Compressing XML with multiplexed hierarchical models. In: Proceedings of the 2001 IEEE Data Compression Conference (DCC 2001) (2001)
Cheng, J., Ng, W.: XQzip, Querying Compressed XML Using Structural Indexing. In: EDBT (2004)
Ferragina, P., Luccio, F., Manzini, G., Muthukrishnan, S.: Compressing and Searching XML Data Via Two Zips. In: Proceedings of the Fifteenth International World Wide Web Conference (2006)
Franceschet, M.: XPathMark: an XPath benchmark for XMark generated data. In: Bressan, S., Ceri, S., Hunt, E., Ives, Z.G., Bellahsène, Z., Rys, M., Unland, R. (eds.) XSym 2005. LNCS, vol. 3671, pp. 129–143. Springer, Heidelberg (2005)
Girardot, M., Sundaresan, N.: Millau: An Encoding Format for Efficient Representation and Exchange of XML over the Web. In: Proceedings of the 9th International WWW Conference (2000)
Huffman, D.A.: A method for the construction of minimum-redundancy codes. In: Proc. of the I.R.E. (1952)
Liefke, H., Suciu, D.: XMill: An Efficient Compressor for XML Data. In: Proc. of ACM SIGMOD (2000)
Min, J.K., Park, M.J., Chung, C.W.: XPRESS: A Queriable Compression for XML Data. In: Proceedings of SIGMOD (2003)
Ng, W., Lam, W.Y., Wood, P.T., Levene, M.: XCQ: A queriable XML compression system. Knowledge and Information Systems (2006)
Olteanu, D., Meuss, H., Furche, T., Bry, F.: XPath: Looking Forward. In: Chaudhri, A.B., Unland, R., Djeraba, C., Lindner, W. (eds.) EDBT 2002. LNCS, vol. 2490, pp. 109–127. Springer, Heidelberg (2002)
Schmidt, A., Waas, F., Kersten, M., Carey, M., Manolescu, I., Busse, R.: XMark: A benchmark for XML data management, Hong Kong, China (2002)
Tolani, P.M., Haritsa, J.R.: XGRIND: A query-friendly XML compressor. In: Proc. ICDE (2002)
Yao, B.B., Özsu, M.T.: XBench - A family of benchmarks for XML DBMS (2002)
Zhang, N., Kacholia, V., Özsu, M.T.: A Succinct Physical Storage Scheme for Efficient Evaluation of Path Queries in XML. In: ICDE (2004)
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Böttcher, S., Hartel, R., Heinzemann, C. (2009). Compressing XML Data Streams with DAG+BSBC. In: Cordeiro, J., Hammoudi, S., Filipe, J. (eds) Web Information Systems and Technologies. WEBIST 2008. Lecture Notes in Business Information Processing, vol 18. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-01344-7_6
DOI: https://doi.org/10.1007/978-3-642-01344-7_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-01343-0
Online ISBN: 978-3-642-01344-7
eBook Packages: Computer Science, Computer Science (R0)