Finding Ambiguous Patterns on Grammar Compressed String

  • Conference paper
New Frontiers in Artificial Intelligence (JSAI-isAI 2014)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9067)

Abstract

Given a grammar-compressed string S, a pattern P, and \(d\ge 0\), we consider the problem of finding all occurrences of \(P'\) in S such that \(d(P,P')\le d\) with respect to Hamming distance. We propose an algorithm that solves this problem in \(O(\lg \lg n \lg ^* N(m+d\ occ_d \lg \frac{m}{d}\lg N))\) time, where \(N=|S|\), \(m=|P|\), n is the number of variables in the grammar compression, and \(occ_d\) is the frequency of an evidence of a substring of P. We implement this algorithm and compare it with a naive filtering on the grammar compression.
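The problem statement above can be illustrated with a brute-force baseline. The sketch below is not the algorithm proposed in the paper and does not achieve its time bound: it simply decompresses the grammar and scans every window. The grammar encoding (a dictionary mapping each variable to a terminal character or to a pair of variables) and the names `expand` and `hamming_occurrences` are assumptions made for illustration only.

```python
from typing import Dict, List, Tuple, Union

# Hypothetical grammar encoding (an assumption, not the paper's format):
# each variable maps to a single terminal character or to a pair of variables.
Rule = Union[str, Tuple[str, str]]


def expand(grammar: Dict[str, Rule], start: str) -> str:
    """Decompress the string derived from `start` using an explicit stack."""
    out: List[str] = []
    stack = [start]
    while stack:
        rule = grammar[stack.pop()]
        if isinstance(rule, str):
            out.append(rule)
        else:
            left, right = rule
            stack.append(right)  # pushed first so that `left` is expanded first
            stack.append(left)
    return "".join(out)


def hamming_occurrences(text: str, pattern: str, d: int) -> List[int]:
    """Return every start position i where the window text[i:i+m]
    differs from `pattern` in at most d positions (Hamming distance)."""
    m = len(pattern)
    hits: List[int] = []
    for i in range(len(text) - m + 1):
        mismatches = 0
        for a, b in zip(text[i:i + m], pattern):
            if a != b:
                mismatches += 1
                if mismatches > d:
                    break  # this window already exceeds the threshold
        if mismatches <= d:
            hits.append(i)
    return hits


# Toy grammar deriving S = "ababa" from the start variable E.
grammar: Dict[str, Rule] = {
    "A": "a",
    "B": "b",
    "C": ("A", "B"),
    "D": ("C", "C"),
    "E": ("D", "A"),
}
S = expand(grammar, "E")
print(hamming_occurrences(S, "aaa", 1))
```

This baseline costs time proportional to N·m after full decompression, which is exactly the expense that working directly on the compressed representation is meant to avoid.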

Notes

  1. In a practical sense, \(\lg ^*N\) is a constant for sufficiently large N.
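To see why \(\lg ^*N\) behaves like a constant, the iterated logarithm can be computed directly. The following is a minimal sketch, assuming base-2 logarithms and the common definition "the number of times \(\lg\) must be applied before the value drops to at most 1"; the function name `log_star` is hypothetical.

```python
import math


def log_star(n: float) -> int:
    """Iterated base-2 logarithm: count how many times log2 must be
    applied to n before the result is at most 1."""
    count = 0
    while n > 1:
        n = math.log2(n)
        count += 1
    return count


for value in (16, 65536, 2 ** 64):
    print(value, log_star(value))
```

Under this definition the value is 4 for N = 65536 and only 5 for N = 2^64, so for any input that fits in memory the factor does not exceed a small constant.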

Author information

Corresponding author

Correspondence to Hiroshi Sakamoto.

Copyright information

© 2015 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Maeda, K., Takabatake, Y., Tabei, Y., Sakamoto, H. (2015). Finding Ambiguous Patterns on Grammar Compressed String. In: Murata, T., Mineshima, K., Bekki, D. (eds) New Frontiers in Artificial Intelligence. JSAI-isAI 2014. Lecture Notes in Computer Science, vol 9067. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-48119-6_25

  • DOI: https://doi.org/10.1007/978-3-662-48119-6_25

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-662-48118-9

  • Online ISBN: 978-3-662-48119-6

  • eBook Packages: Computer Science, Computer Science (R0)