The Sketching Complexity of Pattern Matching

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 3122)

Abstract

We address the problems of pattern matching and approximate pattern matching in the sketching model. We show that it is impossible to compress the text into a small sketch and use only the sketch to decide whether a given pattern occurs in the text. We also prove a sketch size lower bound for approximate pattern matching, and show it is tight up to a logarithmic factor.
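To make the sketching model concrete, here is a minimal illustration of a problem that does admit a small sketch: string equality, decided from two short fingerprints computed with shared randomness. This is not taken from the paper. The polynomial fingerprint, the modulus `PRIME`, and the helper name `make_sketcher` are assumptions chosen for illustration. The abstract's impossibility result says that no comparably small sketch of the text can suffice for deciding whether a given pattern occurs in it.

```python
import random

# Hypothetical modulus for illustration: the Mersenne prime 2^61 - 1.
PRIME = (1 << 61) - 1


def make_sketcher(seed):
    """Return a sketch function; both parties must use the same seed."""
    rng = random.Random(seed)
    base = rng.randrange(2, PRIME - 1)  # shared random evaluation point

    def sketch(text):
        data = text.encode("utf-8")
        value = 0
        # Horner's rule: evaluate the polynomial whose coefficients are
        # the bytes of the text (shifted by 1 so no coefficient is zero).
        for byte in data:
            value = (value * base + byte + 1) % PRIME
        # The sketch is the byte length plus one 61-bit residue,
        # regardless of how long the text is.
        return (len(data), value)

    return sketch


sketch = make_sketcher(seed=7)
# Equality is decided from the sketches alone, never from the texts.
print(sketch("abracadabra") == sketch("abracadabra"))
print(sketch("abracadabra") == sketch("abracadabrb"))
```

Equal strings always produce equal sketches. Two different strings of the same length n collide with probability at most n / PRIME over the choice of the evaluation point, and strings of different lengths are always told apart by the stored length.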

Copyright information

© 2004 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Bar-Yossef, Z., Jayram, T.S., Krauthgamer, R., Kumar, R. (2004). The Sketching Complexity of Pattern Matching. In: Jansen, K., Khanna, S., Rolim, J.D.P., Ron, D. (eds) Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques. RANDOM APPROX 2004. Lecture Notes in Computer Science, vol 3122. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-27821-4_24

  • DOI: https://doi.org/10.1007/978-3-540-27821-4_24

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-22894-3

  • Online ISBN: 978-3-540-27821-4

  • eBook Packages: Springer Book Archive