Parallelization of Permuting XML Compressors

  • Conference paper
Parallel Processing and Applied Mathematics (PPAM 2013)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 8384)

Abstract

The verbose nature of XML leads to overheads in storage and network transfers, which may be overcome by using parallel computing. This paper presents four permuting parallel XML compressors based on an existing XML compressor, XSAQCT. Tests were performed on multi-core machines using a test suite of XML documents with various characteristics, and the results were analyzed to determine the upper bounds given by Amdahl’s law, the actual speedup, and the compression ratios.
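The upper bound from Amdahl’s law mentioned in the abstract can be illustrated with a minimal sketch. The function names and the sample numbers below are assumptions made for illustration only; they are not taken from the paper or from XSAQCT. For a parallelizable fraction p of the run time and n workers, the bound is 1 / ((1 - p) + p / n).

```python
import math


def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Upper bound on speedup for a given parallelizable fraction and worker count."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must lie in [0, 1]")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


def amdahl_limit(parallel_fraction: float) -> float:
    """Bound as the worker count grows without limit: 1 / (1 - p)."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must lie in [0, 1]")
    serial_fraction = 1.0 - parallel_fraction
    return math.inf if serial_fraction == 0.0 else 1.0 / serial_fraction


def fraction_of_bound(measured_speedup: float, bound: float) -> float:
    """How close a measured speedup comes to the Amdahl bound (1.0 means it reaches it)."""
    return measured_speedup / bound


if __name__ == "__main__":
    # Hypothetical values: 90% of the work parallelizable, 4 cores,
    # and an invented measured speedup of 2.5.
    bound = amdahl_speedup(0.9, 4)
    print(f"Amdahl bound on 4 cores: {bound:.2f}")
    print(f"Limit with unlimited cores: {amdahl_limit(0.9):.2f}")
    print(f"Measured / bound: {fraction_of_bound(2.5, bound):.2f}")
```

Comparing a measured speedup against this bound shows how much of the remaining gap is due to the serial part of the program and how much to other overheads such as synchronization or I/O.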

Acknowledgments

The work of the first author is partially supported by NSERC CSG-M and the work of the second author by the NSERC RGPIN grant.

Corresponding author

Correspondence to Tomasz Müldner.

Copyright information

© 2014 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Corbin, T., Müldner, T., Miziołek, J.K. (2014). Parallelization of Permuting XML Compressors. In: Wyrzykowski, R., Dongarra, J., Karczewski, K., Waśniewski, J. (eds) Parallel Processing and Applied Mathematics. PPAM 2013. Lecture Notes in Computer Science, vol 8384. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-55224-3_31

  • DOI: https://doi.org/10.1007/978-3-642-55224-3_31

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-55223-6

  • Online ISBN: 978-3-642-55224-3

  • eBook Packages: Computer Science, Computer Science (R0)
