Window Subsequence Problems for Compressed Texts

  • Conference paper
Computer Science – Theory and Applications (CSR 2006)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 3967)

Abstract

Given two strings (a text t of length n and a pattern p) and a natural number w, window subsequence problems consist of deciding whether p occurs as a subsequence of t and/or counting the windows of size (at most) w in t that contain p as a subsequence, i.e., the letters of p occur in the window in the same order as in p, but not necessarily consecutively (they may be interleaved with other letters). We search for subsequences in a text compressed with Lempel-Ziv-like algorithms, without decompressing it, and we want our algorithms to be almost optimal, in the sense that they run in time O(m), where m is the size of the compressed text. The pattern is uncompressed, because the compression algorithms are adaptive: different occurrences of the same pattern may look different in the compressed text.
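
To make the problem statement concrete, the following minimal Python sketch solves the window subsequence problem naively on an uncompressed text. It is not the algorithm of the paper, which works on the compressed representation without decompressing it. The function names is_subsequence and count_windows are hypothetical, and the sketch assumes windows of size exactly w.

```python
def is_subsequence(pattern: str, text: str) -> bool:
    """Return True if the letters of pattern occur in text in the same
    order, not necessarily consecutively."""
    it = iter(text)
    # Each `ch in it` consumes the iterator up to and including the first
    # match, so later letters are only searched for after earlier ones.
    return all(ch in it for ch in pattern)


def count_windows(text: str, pattern: str, w: int) -> int:
    """Count the windows of size exactly w (an assumption of this sketch)
    that contain pattern as a subsequence. Naive O(n * w) baseline."""
    n = len(text)
    if w <= 0 or w > n:
        return 0
    return sum(
        is_subsequence(pattern, text[i:i + w])
        for i in range(n - w + 1)
    )


if __name__ == "__main__":
    # Windows of size 3 in "abcabc": "abc", "bca", "cab", "abc".
    # Only the two "abc" windows contain "ac" as a subsequence.
    print(count_windows("abcabc", "ac", 3))  # prints 2
```

This baseline reads the whole text of length n, which is exactly the cost the compressed-text setting aims to avoid: the goal stated above is a running time that depends on the compressed size m instead.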

Support by grants INTAS–04-77-7173 and NSh–2203-2003-1 is gratefully acknowledged.

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Cégielski, P., Guessarian, I., Lifshits, Y., Matiyasevich, Y. (2006). Window Subsequence Problems for Compressed Texts. In: Grigoriev, D., Harrison, J., Hirsch, E.A. (eds) Computer Science – Theory and Applications. CSR 2006. Lecture Notes in Computer Science, vol 3967. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11753728_15

  • DOI: https://doi.org/10.1007/11753728_15

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-34166-6

  • Online ISBN: 978-3-540-34168-0

  • eBook Packages: Computer Science (R0)
