Algorithms on Compressed Strings and Arrays

  • Conference paper
SOFSEM’99: Theory and Practice of Informatics (SOFSEM 1999)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1725)

Abstract

We survey the complexity of several algorithmic problems on compressed one- and two-dimensional texts that are solved without explicit decompression: pattern-matching, equality-testing, computation of regularities, subsegment extraction, language membership, and solvability of word equations. Our basic problem is one- and two-dimensional pattern-matching together with its variations. For some types of compression the pattern-matching problems are infeasible (NP-hard); for other types they are solvable in polynomial time, and we discuss how to reduce the degree of the corresponding polynomials.
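To make "without explicit decompression" concrete, here is a minimal sketch under stated assumptions. It models a compressed text as a straight-line program (a grammar in which every rule is either a single letter or the concatenation of two earlier rules), which is one common model of compression and is not claimed to be the specific scheme analysed in the paper. The function names `text_lengths`, `char_at`, and `fibonacci_slp` are hypothetical and chosen only for illustration.

```python
from typing import Dict, Tuple, Union

# A rule is either a single terminal letter or a pair of earlier rule names.
Rule = Union[str, Tuple[str, str]]


def text_lengths(rules: Dict[str, Rule]) -> Dict[str, int]:
    """Length of the text generated by each rule.

    Assumes rules are listed so that each rule refers only to earlier ones
    (Python dicts preserve insertion order).
    """
    length: Dict[str, int] = {}
    for name, rhs in rules.items():
        if isinstance(rhs, str):
            length[name] = 1  # terminal letter
        else:
            left, right = rhs
            length[name] = length[left] + length[right]
    return length


def char_at(rules: Dict[str, Rule], length: Dict[str, int],
            start: str, i: int) -> str:
    """Return the i-th letter (0-based) of the text generated by `start`,
    walking down the grammar instead of expanding the text."""
    if not 0 <= i < length[start]:
        raise IndexError("position outside the generated text")
    name = start
    while True:
        rhs = rules[name]
        if isinstance(rhs, str):
            return rhs
        left, right = rhs
        if i < length[left]:
            name = left            # position falls in the left part
        else:
            i -= length[left]      # skip the left part
            name = right


def fibonacci_slp(n: int) -> Dict[str, Rule]:
    """Illustrative grammar for Fibonacci words: F1 = b, F2 = a,
    Fk = F(k-1) F(k-2)."""
    rules: Dict[str, Rule] = {"F1": "b", "F2": "a"}
    for k in range(3, n + 1):
        rules[f"F{k}"] = (f"F{k - 1}", f"F{k - 2}")
    return rules


if __name__ == "__main__":
    rules = fibonacci_slp(80)
    length = text_lengths(rules)
    # 80 rules describe a text far too long to write out explicitly.
    print(length["F80"], char_at(rules, length, "F80", length["F80"] - 1))
```

In this sketch a single-letter query costs time proportional to the depth of the grammar rather than to the length of the text, which is the kind of saving the surveyed problems aim for.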

Supported by the grant KBN 8T11C03915.

Copyright information

© 1999 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Rytter, W. (1999). Algorithms on Compressed Strings and Arrays. In: Pavelka, J., Tel, G., Bartošek, M. (eds) SOFSEM’99: Theory and Practice of Informatics. SOFSEM 1999. Lecture Notes in Computer Science, vol 1725. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-47849-3_3

  • DOI: https://doi.org/10.1007/3-540-47849-3_3

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-66694-3

  • Online ISBN: 978-3-540-47849-2

  • eBook Packages: Springer Book Archive
