Abstract
Given two strings (a text t of length n and a pattern p) and a natural number w, window subsequence problems consist of deciding whether p occurs as a subsequence of t and/or counting the windows of t of size (at most) w that contain p as a subsequence, i.e. the letters of p occur in the window in the same order as in p, but not necessarily consecutively (they may be interleaved with other letters). We search for subsequences in a text compressed with Lempel-Ziv-like compression algorithms, without decompressing it, and we want our algorithms to be almost optimal, in the sense that they run in time O(m), where m is the size of the compressed text. The pattern is uncompressed (because the compression algorithms are adaptive: different occurrences of the same pattern look different in the compressed text).
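To make the problem statement concrete, here is a minimal sketch of the two questions on an uncompressed text. It is a naive baseline for illustration only, not the compressed-text algorithm of the paper: it scans the text directly and does not run in time O(m). The function names `is_subsequence` and `count_windows` are hypothetical, and the sketch assumes windows of size exactly w.

```python
def is_subsequence(pattern: str, text: str) -> bool:
    """Return True if the letters of pattern occur in text in order,
    not necessarily consecutively."""
    remaining = iter(text)
    # `ch in remaining` advances the iterator past the first match,
    # so each pattern letter must be found after the previous one.
    return all(ch in remaining for ch in pattern)


def count_windows(text: str, pattern: str, w: int) -> int:
    """Count the windows of size exactly w (an assumption of this sketch)
    that contain pattern as a subsequence. Naive O(n * w) scan."""
    n = len(text)
    return sum(
        1
        for start in range(n - w + 1)
        if is_subsequence(pattern, text[start:start + w])
    )


if __name__ == "__main__":
    print(is_subsequence("abc", "axbxc"))
    print(count_windows("abcabc", "ac", 3))
```

For the text "abcabc", the pattern "ac" and w = 3, the windows are "abc", "bca", "cab" and "abc"; only the two copies of "abc" contain "ac" as a subsequence.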
Support by grants INTAS–04-77-7173 and NSh–2203-2003-1 is gratefully acknowledged.
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Cégielski, P., Guessarian, I., Lifshits, Y., Matiyasevich, Y. (2006). Window Subsequence Problems for Compressed Texts. In: Grigoriev, D., Harrison, J., Hirsch, E.A. (eds) Computer Science – Theory and Applications. CSR 2006. Lecture Notes in Computer Science, vol 3967. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11753728_15
DOI: https://doi.org/10.1007/11753728_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-34166-6
Online ISBN: 978-3-540-34168-0
eBook Packages: Computer Science, Computer Science (R0)