Dictionary-Symbolwise Flexible Parsing

  • Maxime Crochemore
  • Laura Giambruno
  • Alessio Langiu
  • Filippo Mignosi
  • Antonio Restivo
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6460)


Linear-time optimal parsing algorithms are very rare in the dictionary-based branch of data compression theory. The most recent is the Flexible Parsing algorithm of Matias and Shainalp, which works when the dictionary is prefix-closed and the encoding of dictionary pointers has constant cost. We present the Dictionary-Symbolwise Flexible Parsing algorithm, which is optimal for prefix-closed dictionaries and any symbolwise compressor under some natural hypothesis. For LZ78-like algorithms with variable costs and any symbolwise compressor that runs in linear time, as is usual, it can be implemented in linear time. For LZ77-like dictionaries and any symbolwise compressor, it can be implemented in O(n log n) time. We further present experimental results that show the effectiveness of the dictionary-symbolwise approach.
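
To make the setting concrete, optimal dictionary-symbolwise parsing can be viewed as a shortest-path problem: text positions are nodes, and each edge is either a dictionary phrase or a single symbol sent to the symbolwise compressor, weighted by its encoding cost. The following is a minimal sketch of that baseline view only. It is not the Flexible Parsing algorithm described above and does not achieve its time bounds. The names `optimal_parse`, `dict_cost` and `sym_cost`, the static dictionary, and the bit costs in the usage lines are all assumptions made for illustration.

```python
from typing import Callable, List, Set, Tuple


def optimal_parse(
    text: str,
    dictionary: Set[str],
    dict_cost: Callable[[str], float],
    sym_cost: Callable[[str], float],
) -> Tuple[float, List[Tuple[str, str]]]:
    """Cheapest parse of `text` into dictionary phrases and single symbols.

    Assumes `dictionary` is static and prefix-closed (illustrative
    simplification), so the phrase scan can stop at the first miss.
    """
    n = len(text)
    inf = float("inf")
    best = [inf] * (n + 1)   # best[i] = minimal cost to encode text[:i]
    back = [None] * (n + 1)  # back[j] = (start, kind) of the last edge into j
    best[0] = 0.0
    max_len = max((len(w) for w in dictionary), default=0)

    for i in range(n):
        # Symbolwise edge i -> i + 1; it always exists, so every node is reachable.
        c = best[i] + sym_cost(text[i])
        if c < best[i + 1]:
            best[i + 1] = c
            back[i + 1] = (i, "sym")
        # Dictionary edges i -> i + length, one per matching phrase.
        for length in range(1, min(max_len, n - i) + 1):
            phrase = text[i:i + length]
            if phrase not in dictionary:
                break  # prefix-closed: no longer phrase can match here
            c = best[i] + dict_cost(phrase)
            if c < best[i + length]:
                best[i + length] = c
                back[i + length] = (i, "dict")

    # Walk the back pointers from the end to recover the parse.
    parse: List[Tuple[str, str]] = []
    j = n
    while j > 0:
        i, kind = back[j]
        parse.append((kind, text[i:j]))
        j = i
    parse.reverse()
    return best[n], parse


# Hypothetical costs: 4 bits per dictionary pointer, 3 bits per symbol.
cost, parse = optimal_parse("abab", {"a", "ab", "aba"}, lambda p: 4, lambda s: 3)
print(cost, parse)
```

With these made-up costs, the cheapest parse uses one dictionary phrase followed by one symbol rather than two dictionary phrases, which is the kind of saving a mixed dictionary-symbolwise scheme is meant to capture. The nested loop costs time proportional to the text length times the longest phrase length, so this is a reference baseline rather than a linear-time method.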


Compression Ratio; Minimal Path; Parsing Algorithm; Single Source Short Path; Simple Data Structure
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.




  1. Bell, T.C., Witten, I.H.: The relationship between greedy parsing and symbolwise text compression. J. ACM 41(4), 708–724 (1994)
  2. Cohn, M., Khazan, R.: Parsing with prefix and suffix dictionaries. In: Data Compression Conference, pp. 180–189 (1996)
  3. Cormen, T.H., Leiserson, C.E., Rivest, R.L., Stein, C.: Introduction to Algorithms, 2nd edn. MIT Press, Cambridge (2001)
  4. Ferragina, P., Nitto, I., Venturini, R.: On the bit-complexity of Lempel-Ziv compression. In: Proceedings of the Nineteenth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2009, pp. 768–777. Society for Industrial and Applied Mathematics, Philadelphia (2009)
  5. Gzip’s Home Page,
  6. Hartman, A., Rodeh, M.: Optimal parsing of strings, pp. 155–167. Springer, Heidelberg (1985)
  7. Horspool, R.N.: The effect of non-greedy parsing in Ziv-Lempel compression methods. In: Data Compression Conference (1995)
  8. Katajainen, J., Raita, T.: An approximation algorithm for space-optimal encoding of a text. Comput. J. 32(3), 228–237 (1989)
  9. Katajainen, J., Raita, T.: An analysis of the longest match and the greedy heuristics in text encoding. J. ACM 39(2), 281–294 (1992)
  10. Katz, P.: Pkzip archiving tool (1989),
  11. Kim, T.Y., Kim, T.: On-line optimal parsing in dictionary-based coding adaptive. Electronics Letters 34(11), 1071–1072 (1998)
  12. Klein, S.T.: Efficient optimal recompression. Comput. J. 40(2/3), 117–126 (1997)
  13. Mahoney, M.: Large text compression benchmark,
  14. Martelock, C.: Rzm order-1 rolz compressor (April 2008),
  15. Matias, Y., Rajpoot, N., Shainalp, S.C.: The effect of flexible parsing for dynamic dictionary-based data compression. ACM Journal of Experimental Algorithms 6, 10 (2001)
  16. Matias, Y., Shainalp, S.C.: On the optimality of parsing in dynamic dictionary based data compression. In: SODA, pp. 943–944 (1999)
  17. Della Penna, G., Langiu, A., Mignosi, F., Ulisse, A.: Optimal parsing in dictionary-symbolwise data compression schemes (2006),
  18. Schuegraf, E.J., Heaps, H.S.: A comparison of algorithms for data base compression by use of fragments as language elements. Information Storage and Retrieval 10(9-10), 309–319 (1974)
  19. Storer, J.A., Szymanski, T.G.: Data compression via textual substitution. J. ACM 29(4), 928–951 (1982)
  20. Wagner, R.A.: Common phrases and minimum-space text storage. Commun. ACM 16(3), 148–152 (1973)

Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Maxime Crochemore (1, 4)
  • Laura Giambruno (2)
  • Alessio Langiu (2, 4)
  • Filippo Mignosi (3)
  • Antonio Restivo (2)

  1. Dept. of Computer Science, King’s College London, London, UK
  2. Dipartimento di Matematica e Informatica, Università di Palermo, Palermo, Italy
  3. Dipartimento di Informatica, Università dell’Aquila, L’Aquila, Italy
  4. Institut Gaspard-Monge, Université Paris-Est, France
