On the complexity of pattern matching for highly compressed two-dimensional texts

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1264)

Abstract

We consider the complexity of problems related to 2-dimensional texts (2d-texts) that are described succinctly. In a succinct description, larger rectangular sub-texts are defined in terms of smaller parts, in a way similar to Lempel-Ziv compression for 1-dimensional texts, to strings with short descriptions as in [9], or to hierarchical graphs described by context-free graph grammars. A given 2d-text T with many internal repetitions can have a hierarchical description (denoted Compress(T)) that is up to exponentially smaller than T, and this description can be the only part of the input given to a pattern-matching algorithm that reports information about T. Such a hierarchical description is given in terms of a straight-line program (see [9]) or, equivalently, a 2-dimensional grammar.
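To make the notion of a hierarchical description concrete, here is a minimal sketch of a 2-dimensional straight-line program. The rule encoding (`"sym"`, `"h"`, `"v"`) and the names `dims`, `expand`, and `doubling_program` are assumptions made for this illustration and are not taken from the paper. The example shows a description with 2k+1 rules whose text has 2^k rows and 2^k columns.

```python
# Hypothetical rule encoding (an assumption for this sketch):
#   ("sym", c)   a 1x1 text holding the letter c
#   ("h", A, B)  A placed to the left of B (heights must agree)
#   ("v", A, B)  A placed on top of B (widths must agree)

def dims(rules, start):
    """Return {name: (height, width)} for all variables reachable from start."""
    size = {}

    def visit(name):
        if name in size:
            return size[name]
        rule = rules[name]
        if rule[0] == "sym":
            size[name] = (1, 1)
        else:
            op, a, b = rule
            (ha, wa), (hb, wb) = visit(a), visit(b)
            if op == "h":
                if ha != hb:
                    raise ValueError(f"height mismatch in {name}")
                size[name] = (ha, wa + wb)
            else:
                if wa != wb:
                    raise ValueError(f"width mismatch in {name}")
                size[name] = (ha + hb, wa)
        return size[name]

    visit(start)
    return size


def expand(rules, name):
    """Decompress to a list of row strings; can take exponential time and space."""
    rule = rules[name]
    if rule[0] == "sym":
        return [rule[1]]
    op, a, b = rule
    rows_a, rows_b = expand(rules, a), expand(rules, b)
    if op == "h":
        return [ra + rb for ra, rb in zip(rows_a, rows_b)]
    return rows_a + rows_b


def doubling_program(k):
    """Build 2k+1 rules describing a 2^k x 2^k text filled with 'a'."""
    rules = {"X0": ("sym", "a")}
    for i in range(1, k + 1):
        rules[f"R{i}"] = ("h", f"X{i-1}", f"X{i-1}")  # two copies side by side
        rules[f"X{i}"] = ("v", f"R{i}", f"R{i}")      # two copies stacked
    return rules
```

Calling `dims` on `doubling_program(30)` computes the dimensions from 61 rules without ever materializing the text, whereas `expand` is only practical for small k.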

We consider compressed pattern-matching, where the input consists of a 2d-pattern P and of a hierarchical description of a 2d-text T, and fully compressed pattern-matching, where the input consists of hierarchical descriptions of both the pattern P and the text T. For 1-dimensional strings there exist polynomial-time deterministic algorithms for these problems, for similar types of succinct text descriptions [2, 6, 8, 9]. We show that the complexity dramatically increases in a 2-dimensional setting. For example, compressed 2d-matching is NP-complete, fully compressed 2d-matching is $\Sigma_2^p$-complete, and testing a given occurrence of a two-dimensional compressed pattern is co-NP-complete.

On the other hand, we give efficient algorithms for the related problems of randomized equality testing and testing for a given occurrence of an uncompressed pattern.
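One standard way to test equality of two compressed texts at random is polynomial fingerprinting in the spirit of [17]. The sketch below is an illustration under stated assumptions and is not claimed to be the paper's algorithm: it reuses a hypothetical rule encoding (`"sym"`, `"h"`, `"v"`), treats a text T as the polynomial formed by summing T[i][j]·x^i·y^j over all cells, and evaluates it modulo a fixed prime at random points without decompressing. The prime `P` and the number of trials are arbitrary choices; the error bound depends on the text dimensions being small relative to `P`.

```python
import random

P = (1 << 61) - 1  # a Mersenne prime; assumed large relative to the text dimensions


def fingerprint(rules, start, x, y):
    """Return (height, width, value), where value is the sum of
    T[i][j] * x^i * y^j mod P, computed directly on the compressed description."""
    memo = {}

    def visit(name):
        if name in memo:
            return memo[name]
        rule = rules[name]
        if rule[0] == "sym":
            res = (1, 1, ord(rule[1]) % P)
        else:
            op, a, b = rule
            ha, wa, fa = visit(a)
            hb, wb, fb = visit(b)
            if op == "h":
                # B starts wa columns to the right, so shift it by y^wa.
                res = (ha, wa + wb, (fa + pow(y, wa, P) * fb) % P)
            else:
                # B starts ha rows below, so shift it by x^ha.
                res = (ha + hb, wa, (fa + pow(x, ha, P) * fb) % P)
        memo[name] = res
        return res

    return visit(start)


def probably_equal(rules1, start1, rules2, start2, trials=3, rng=random):
    """One-sided test: False is always correct; True may be wrong with small probability."""
    for _ in range(trials):
        x, y = rng.randrange(1, P), rng.randrange(1, P)
        if fingerprint(rules1, start1, x, y) != fingerprint(rules2, start2, x, y):
            return False
    return True
```

Because each rule is visited once and exponents are handled by modular exponentiation, the running time is polynomial in the size of the descriptions even when the texts themselves are exponentially large.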

We also show the surprising fact that the compressed size of a subrectangle of a compressed 2d-text can grow exponentially, unlike in the one-dimensional case.

This research was partially supported by DFG grant KA 673/4-1.

Research partially supported by National Science Foundation grant CCR-9503441. Part of this work was done while the author was visiting Institut Informatik V, Universität Bonn, Germany.

Supported by the grant KBN 8T11C01208.

References

  1. A.V. Aho, J.E. Hopcroft, and J.D. Ullman, The design and analysis of computer algorithms, Addison-Wesley, Reading, Mass., 1974.

  2. A. Amir, G. Benson and M. Farach, Let sleeping files lie: pattern-matching in Z-compressed files, in SODA '94.

  3. A. Amir, G. Benson, Efficient two dimensional compressed matching, Proc. of the 2nd IEEE Data Compression Conference 279–288 (1992).

  4. A. Amir, G. Benson and M. Farach, Optimal two-dimensional compressed matching, in ICALP'94 pp. 215–225.

  5. M. Crochemore and W. Rytter, Text Algorithms, Oxford University Press, New York (1994).

  6. M. Farach and M. Thorup, String matching in Lempel-Ziv compressed strings, in STOC'95, pp. 703–712.

  7. M.R. Garey and D.S. Johnson, Computers and Intractability: A Guide to the Theory of NP-Completeness. W.H. Freeman (1979).

  8. L. Gasieniec, M. Karpiński, W. Plandowski and W. Rytter, Efficient algorithms for compressed strings, in Proceedings of SWAT'96 (1996).

  9. M. Karpinski, W. Rytter and A. Shinohara, Pattern-matching for strings with short description, in Combinatorial Pattern Matching, 1995.

  10. D. Knuth, The Art of Computer Programming, Vol. II: Seminumerical Algorithms. Second edition. Addison-Wesley, 1981.

  11. A. Lempel and J. Ziv, On the complexity of finite sequences, IEEE Trans. on Inf. Theory 22, 75–81 (1976).

  12. A. Lempel and J. Ziv, Compression of two-dimensional images sequences, in Combinatorial Algorithms on Words (eds. A. Apostolico, Z. Galil), Springer-Verlag, 141–156 (1985).

  13. R. Motwani, P. Raghavan, Randomized algorithms, Cambridge University Press 1995.

  14. Ch. H. Papadimitriou, Computational complexity, Addison-Wesley, Reading, Massachusetts, 1994.

  15. W. Plandowski, Testing equivalence of morphisms on context-free languages, ESA'94, Lecture Notes in Computer Science 855, Springer-Verlag, 460–470 (1994).

  16. J. Storer, Data compression: methods and theory, Computer Science Press, Rockville, Maryland, 1988.

  17. R.E. Zippel, Probabilistic algorithms for sparse polynomials, in EUROSAM 79, Lecture Notes in Comp. Science 72, 216–226 (1979).

  18. J. Ziv and A. Lempel, A universal algorithm for sequential data compression, IEEE Trans. on Inf. Theory vol. IT-23(3), 337–343, 1977.

Editor information

Alberto Apostolico, Jotun Hein

Copyright information

© 1997 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Berman, P., Karpinski, M., Larmore, L.L., Plandowski, W., Rytter, W. (1997). On the complexity of pattern matching for highly compressed two-dimensional texts. In: Apostolico, A., Hein, J. (eds) Combinatorial Pattern Matching. CPM 1997. Lecture Notes in Computer Science, vol 1264. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-63220-4_48

  • DOI: https://doi.org/10.1007/3-540-63220-4_48

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-63220-7

  • Online ISBN: 978-3-540-69214-0

  • eBook Packages: Springer Book Archive
