Improved Approximate String Matching and Regular Expression Matching on Ziv-Lempel Compressed Texts

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 4580)

Abstract

We study the approximate string matching and regular expression matching problems for the case when the text to be searched is compressed with the Ziv-Lempel adaptive dictionary compression schemes. We present a time-space trade-off that leads to algorithms improving the previously known complexities for both problems. In particular, we significantly improve the space bounds. In practical applications, space is likely to be the bottleneck, so this improvement is of crucial importance.
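To make the problem setting concrete, the following is a minimal sketch of the naive baseline: decompress the text fully, then scan it for approximate occurrences. It is not the algorithm of this paper, which aims to avoid exactly this cost in space. The sketch assumes an LZ78-style phrase parsing and uses the classical column-wise dynamic program for matching with at most k errors; all function names are hypothetical and chosen for illustration only.

```python
def lz78_compress(text):
    """Parse text into LZ78-style phrases (ref, char); ref 0 is the empty phrase."""
    dictionary = {"": 0}
    phrases = []
    current = ""
    for ch in text:
        if current + ch in dictionary:
            current += ch  # keep extending the longest known phrase
        else:
            phrases.append((dictionary[current], ch))
            dictionary[current + ch] = len(dictionary)
            current = ""
    if current:
        # Leftover input is already a known phrase; the dictionary is
        # prefix-closed, so its prefix without the last char is known too.
        phrases.append((dictionary[current[:-1]], current[-1]))
    return phrases


def lz78_decompress(phrases):
    """Rebuild the text by expanding each (ref, char) phrase in order."""
    strings = [""]
    out = []
    for ref, ch in phrases:
        s = strings[ref] + ch
        strings.append(s)
        out.append(s)
    return "".join(out)


def approx_match_ends(pattern, text, k):
    """Return 1-based end positions in text of substrings within edit
    distance k of pattern, using one DP column of size len(pattern) + 1."""
    m = len(pattern)
    col = list(range(m + 1))  # column for the empty text prefix
    ends = []
    for j, c in enumerate(text, start=1):
        prev_diag = col[0]
        col[0] = 0  # a match may start at any text position
        for i in range(1, m + 1):
            tmp = col[i]
            col[i] = min(prev_diag + (pattern[i - 1] != c),  # match or substitute
                         col[i] + 1,                         # skip a text character
                         col[i - 1] + 1)                     # skip a pattern character
            prev_diag = tmp
        if col[m] <= k:
            ends.append(j)
    return ends


def naive_compressed_search(pattern, phrases, k):
    """Baseline only: decompress everything, then search the plain text."""
    return approx_match_ends(pattern, lz78_decompress(phrases), k)
```

This baseline needs space proportional to the uncompressed text, which is the bottleneck the abstract refers to; algorithms for compressed matching instead process the phrases directly.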


Editor information

Bin Ma, Kaizhong Zhang

Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Bille, P., Fagerberg, R., Gørtz, I.L. (2007). Improved Approximate String Matching and Regular Expression Matching on Ziv-Lempel Compressed Texts. In: Ma, B., Zhang, K. (eds) Combinatorial Pattern Matching. CPM 2007. Lecture Notes in Computer Science, vol 4580. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-73437-6_8

  • DOI: https://doi.org/10.1007/978-3-540-73437-6_8

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-73436-9

  • Online ISBN: 978-3-540-73437-6

  • eBook Packages: Computer Science, Computer Science (R0)
