On the Value of Multiple Read/Write Streams for Data Compression

  • Conference paper
Combinatorial Pattern Matching (CPM 2009)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 5577)

Abstract

We study whether, when restricted to using polylogarithmic memory and polylogarithmic passes, we can achieve qualitatively better data compression with multiple read/write streams than we can with only one. We first show how we can achieve universal compression using only one pass over one stream. We then show that one stream is not sufficient for us to achieve good grammar-based compression. Finally, we show that two streams are necessary and sufficient for us to achieve entropy-only bounds.
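The abstract states its final result in terms of entropy-only bounds but does not define the entropy measure. As a rough illustration, assuming the standard k-th order empirical entropy of a string is the intended quantity (the paper may use a modified variant), it can be computed as in the sketch below. The function names `h0` and `hk` are hypothetical and chosen only for this example.

```python
from collections import Counter, defaultdict
from math import log2


def h0(s):
    """Zeroth-order empirical entropy of s, in bits per character."""
    n = len(s)
    if n == 0:
        return 0.0
    # Sum over distinct characters: (count / n) * log2(n / count).
    return sum(c / n * log2(n / c) for c in Counter(s).values())


def hk(s, k):
    """k-th order empirical entropy of s, in bits per character.

    For each length-k context w, collect the characters that follow an
    occurrence of w in s, take the zeroth-order entropy of that collection,
    and average those entropies weighted by collection size.
    """
    if k == 0:
        return h0(s)
    n = len(s)
    if n == 0:
        return 0.0
    followers = defaultdict(list)
    for i in range(n - k):
        followers[s[i:i + k]].append(s[i + k])
    return sum(len(f) * h0(f) for f in followers.values()) / n
```

For a string such as "abab", every character is determined by the one before it, so the first-order value drops to zero even though the zeroth-order value is one bit per character. This sketch holds the whole string in memory and makes no attempt to respect the polylogarithmic memory or pass limits the abstract discusses.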



Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Gagie, T. (2009). On the Value of Multiple Read/Write Streams for Data Compression. In: Kucherov, G., Ukkonen, E. (eds) Combinatorial Pattern Matching. CPM 2009. Lecture Notes in Computer Science, vol 5577. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-02441-2_7

  • DOI: https://doi.org/10.1007/978-3-642-02441-2_7

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-02440-5

  • Online ISBN: 978-3-642-02441-2

  • eBook Packages: Computer Science, Computer Science (R0)
