Compressing XML Data Streams with DAG+BSBC

  • Conference paper
Web Information Systems and Technologies (WEBIST 2008)

Part of the book series: Lecture Notes in Business Information Processing (LNBIP, volume 18)

Abstract

Whenever the growing amount of XML data that has to be stored, processed, exchanged, or transmitted becomes a major cost driver or performance bottleneck, XML compression is an important way to reduce these problems. However, many applications, e.g., those exchanging XML data streams, also require efficient path query processing on the structure of compressed XML data streams. We present an XML compression technique called DAG+BSBC, which extends Bit-Stream-Based-Compression (BSBC) [3] by a sparse index to compressed constants that reflects DAG pointers. Furthermore, DAG+BSBC supports XML stream compression and queries on compressed data, and provides a compression ratio that not only significantly outperforms that of other queriable XML compression techniques, like XGrind, but is also very competitive compared to non-queriable compression techniques like gzip and XMill.
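To illustrate the general idea behind the "DAG" part of the name, the following is a minimal sketch of sharing identical XML subtrees so that each distinct subtree is stored only once. It is not the authors' algorithm: the function names `build_dag` and `count_elements` are hypothetical, the sketch works on a fully parsed document rather than on a stream, it ignores text constants and attributes, and it does not show the BSBC bit-stream encoding or the sparse index mentioned in the abstract.

```python
import xml.etree.ElementTree as ET


def build_dag(root):
    """Share identical subtrees (illustrative only, not DAG+BSBC itself).

    Returns (root_id, nodes), where nodes[i] is a pair
    (tag, tuple_of_child_ids) and equal subtrees get the same id.
    Text content and attributes are deliberately ignored.
    """
    ids = {}    # subtree signature -> node id
    nodes = []  # node id -> subtree signature

    def visit(elem):
        # A subtree is identified by its tag and the ids of its children.
        sig = (elem.tag, tuple(visit(child) for child in elem))
        if sig not in ids:
            ids[sig] = len(nodes)
            nodes.append(sig)
        return ids[sig]

    return visit(root), nodes


def count_elements(root):
    """Number of element nodes in the original tree."""
    return sum(1 for _ in root.iter())


doc = ET.fromstring(
    "<lib>"
    "<book><title/><author/></book>"
    "<book><title/><author/></book>"
    "<book><title/></book>"
    "</lib>"
)
root_id, dag_nodes = build_dag(doc)
print(count_elements(doc), "tree nodes ->", len(dag_nodes), "DAG nodes")
```

In this small document the two identical `book` subtrees collapse into one shared node, and the root refers to it twice by id. Repetitive structure of this kind is what makes pointer-based sharing attractive for XML.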

References

  1. Arion, A., Bonifati, A., Manolescu, I., Pugliese, A.: XQueC: A Query-Conscious Compressed XML Database. ACM Transactions on Internet Technology (to appear)

  2. Bayardo, R.J., Gruhl, D., Josifovski, V., Myllymaki, J.: An evaluation of binary XML encoding optimizations for fast stream based XML processing. In: Proc. of the 13th international conference on World Wide Web (2004)

  3. Böttcher, S., Hartel, R., Heinzemann, C.: Towards a succinct data format for XML streams. In: International Conference on Web Information Systems (WEBIST) (2008)

  4. Böttcher, S., Steinmetz, R., Klein, N.: XML Index Compression by DTD Subtraction. In: International Conference on Enterprise Information Systems (ICEIS) (2007)

  5. Böttcher, S., Steinmetz, R.: Data Management for Mobile Ajax Web 2.0 Applications. In: Wagner, R., Revell, N., Pernul, G. (eds.) DEXA 2007. LNCS, vol. 4653, pp. 424–433. Springer, Heidelberg (2007)

  6. Buneman, P., Grohe, M., Koch, C.: Path Queries on Compressed XML. In: VLDB (2003)

  7. Burrows, M., Wheeler, D.: A block sorting lossless data compression algorithm. Technical Report 124, Digital Equipment Corporation (1994)

  8. Busatto, G., Lohrey, M., Maneth, S.: Efficient Memory Representation of XML Documents. In: Bierman, G., Koch, C. (eds.) DBPL 2005. LNCS, vol. 3774, pp. 199–216. Springer, Heidelberg (2005)

  9. Candan, K.S., Hsiung, W.-P., Chen, S., Tatemura, J., Agrawal, D.: AFilter: Adaptable XML Filtering with Prefix-Caching and Suffix-Clustering. In: VLDB (2006)

  10. Cheney, J.: Compressing XML with multiplexed hierarchical models. In: Proceedings of the 2001 IEEE Data Compression Conference (DCC 2001) (2001)

  11. Cheng, J., Ng, W.: XQzip, Querying Compressed XML Using Structural Indexing. In: EDBT (2004)

  12. Ferragina, P., Luccio, F., Manzini, G., Muthukrishnan, S.: Compressing and Searching XML Data Via Two Zips. In: Proceedings of the Fifteenth International World Wide Web Conference (2006)

  13. Franceschet, M.: XPathMark: an XPath benchmark for XMark generated data. In: Bressan, S., Ceri, S., Hunt, E., Ives, Z.G., Bellahsène, Z., Rys, M., Unland, R. (eds.) XSym 2005. LNCS, vol. 3671, pp. 129–143. Springer, Heidelberg (2005)

  14. Girardot, M., Sundaresan, N.: Millau: An Encoding Format for Efficient Representation and Exchange of XML over the Web. In: Proceedings of the 9th International WWW Conference (2000)

  15. Huffman, D.A.: A method for the construction of minimum-redundancy codes. In: Proc. of the I.R.E. (1952)

  16. Liefke, H., Suciu, D.: XMill: An Efficient Compressor for XML Data. In: Proc. of ACM SIGMOD (2000)

  17. Min, J.K., Park, M.J., Chung, C.W.: XPRESS: A Queriable Compression for XML Data. In: Proceedings of SIGMOD (2003)

  18. Ng, W., Lam, W.Y., Wood, P.T., Levene, M.: XCQ: A queriable XML compression system. Knowledge and Information Systems (2006)

  19. Olteanu, D., Meuss, H., Furche, T., Bry, F.: XPath: Looking Forward. In: Chaudhri, A.B., Unland, R., Djeraba, C., Lindner, W. (eds.) EDBT 2002. LNCS, vol. 2490, pp. 109–127. Springer, Heidelberg (2002)

  20. Schmidt, A., Waas, F., Kersten, M., Carey, M., Manolescu, I., Busse, R.: XMark: A benchmark for XML data management, Hong Kong, China (2002)

  21. Tolani, P.M., Haritsa, J.R.: XGRIND: A query-friendly XML compressor. In: Proc. ICDE (2002)

  22. Yao, B.B., Özsu, M.T.: XBench - A family of benchmarks for XML DBMS (2002)

  23. Zhang, N., Kacholia, V., Özsu, M.T.: A Succinct Physical Storage Scheme for Efficient Evaluation of Path Queries in XML. In: ICDE (2004)

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Böttcher, S., Hartel, R., Heinzemann, C. (2009). Compressing XML Data Streams with DAG+BSBC. In: Cordeiro, J., Hammoudi, S., Filipe, J. (eds) Web Information Systems and Technologies. WEBIST 2008. Lecture Notes in Business Information Processing, vol 18. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-01344-7_6

  • DOI: https://doi.org/10.1007/978-3-642-01344-7_6

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-01343-0

  • Online ISBN: 978-3-642-01344-7

  • eBook Packages: Computer Science, Computer Science (R0)
