Pattern-Based Data Compression

  • Conference paper
MICAI 2004: Advances in Artificial Intelligence (MICAI 2004)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 2972)

Abstract

Most modern lossless data compression techniques are based on dictionaries. If a string in the data being compressed matches a portion seen previously, that string is added to the dictionary and every occurrence is replaced by a reference to it. A possible generalization of this scheme is to consider not only strings of consecutive symbols but also more general patterns with gaps between their symbols. The main problems with this approach are the complexity of pattern discovery algorithms and the complexity of selecting a good subset of patterns. In this paper we address the latter problem. We demonstrate that it is NP-complete and provide some preliminary results on heuristics that point toward its solution.
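To make the idea of patterns with gaps concrete, the following is a minimal Python sketch, not the authors' method. It assumes a pattern can be represented as a non-empty tuple of (offset, symbol) pairs, and it uses a simple greedy rule as a stand-in heuristic for choosing a subset of patterns; the heuristics studied in the paper are not shown in this preview. The names `Pattern`, `occurrences`, `covered_positions`, and `greedy_select` are hypothetical and introduced only for illustration.

```python
from typing import List, Set, Tuple

# Assumed representation: a pattern is a non-empty tuple of (offset, symbol)
# pairs, with offsets relative to the position where the pattern is anchored.
# Offsets that are left out are the gaps.
Pattern = Tuple[Tuple[int, str], ...]


def occurrences(text: str, pattern: Pattern) -> List[int]:
    """Return every start position at which all symbols of the pattern match."""
    span = max(off for off, _ in pattern) + 1
    return [
        start
        for start in range(len(text) - span + 1)
        if all(text[start + off] == sym for off, sym in pattern)
    ]


def covered_positions(text: str, pattern: Pattern) -> Set[int]:
    """Text positions accounted for by the occurrences of the pattern."""
    return {
        start + off
        for start in occurrences(text, pattern)
        for off, _ in pattern
    }


def greedy_select(text: str, candidates: List[Pattern], budget: int) -> List[Pattern]:
    """Pick up to `budget` patterns, each time taking the candidate that covers
    the most positions not yet covered. A simple heuristic, not an exact solver."""
    chosen: List[Pattern] = []
    covered: Set[int] = set()
    remaining = list(candidates)
    for _ in range(budget):
        best = max(
            remaining,
            key=lambda p: len(covered_positions(text, p) - covered),
            default=None,
        )
        # Stop when no candidate is left or none adds new coverage.
        if best is None or not (covered_positions(text, best) - covered):
            break
        chosen.append(best)
        covered |= covered_positions(text, best)
        remaining.remove(best)
    return chosen


if __name__ == "__main__":
    text = "a1b a2b a3b"
    gapped = ((0, "a"), (2, "b"))      # 'a', any one symbol, then 'b'
    contiguous = ((0, "a"), (1, "1"))  # the ordinary substring "a1"
    print(occurrences(text, gapped))
    print(greedy_select(text, [contiguous, gapped], budget=1))
```

In this toy input the gapped pattern occurs three times while the contiguous substring occurs once, which is the kind of redundancy a purely substring-based dictionary would miss. Greedy selection is used here only because it is short to state; it gives no optimality guarantee, which is consistent with the subset selection problem being NP-complete.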

Categories and Subject Descriptors: E.4 [Coding and Information Theory]–data compaction and compression; F.2.2 [Analysis of Algorithms and Problem Complexity]: Nonnumerical Problems; I.2.8 [Artificial Intelligence]: Problem Solving, Control Methods, and Search–heuristic methods.

General Terms: Algorithms, Theory

Additional Key Words and Phrases: Genetic algorithms, optimization, NP-hardness

Copyright information

© 2004 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Kuri, Á., Galaviz, J. (2004). Pattern-Based Data Compression. In: Monroy, R., Arroyo-Figueroa, G., Sucar, L.E., Sossa, H. (eds) MICAI 2004: Advances in Artificial Intelligence. MICAI 2004. Lecture Notes in Computer Science, vol 2972. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24694-7_1

  • DOI: https://doi.org/10.1007/978-3-540-24694-7_1

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-21459-5

  • Online ISBN: 978-3-540-24694-7

  • eBook Packages: Springer Book Archive
