Skip to main content

Dictionary Matching with One Gap

  • Conference paper
Combinatorial Pattern Matching (CPM 2014)

Part of the book series: Lecture Notes in Computer Science ((LNTCS,volume 8486))

Included in the following conference series:

Abstract

The dictionary matching with gaps problem is to preprocess a dictionary D of d gapped patterns P 1,…,P d over alphabet Σ, where each gapped pattern P i is a sequence of subpatterns separated by bounded sequences of don’t cares. Then, given a query text T of length n over alphabet Σ, the goal is to output all locations in T in which a pattern P i  ∈ D, 1 ≤ i ≤ d, ends. There is a renewed current interest in the gapped matching problem stemming from cyber security. In this paper we solve the problem where all patterns in the dictionary have one gap with at least α and at most β don’t cares, where α and β are given parameters. Specifically, we show that the dictionary matching with a single gap problem can be solved in either O(dlogd + |D|) time and O(dlogε d + |D|) space, and query time O(n(β − α)loglogd log2 min { d, log|D| } + occ), where occ is the number of patterns found, or preprocessing time: O(d 2·ovr + |D|), where ovr is the maximal number of subpatterns including each other as a prefix or as a suffix, space: O(d 2 + |D|), and query time O(n(β − α) + occ), where occ is the number of patterns found. As far as we know, this is the best solution for this setting of the problem, where many overlaps may exist in the dictionary.

This research was supported by the Kabarnit Cyber consortium funded by the Chief Scientist in the Israeli Ministry of Economy under the Magnet Program.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 39.99
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 54.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. Aho, A.V., Corasick, M.J.: Efficient string matching: an aid to bibliographic search. Comm. ACM 18(6), 333–340 (1975)

    Article  MATH  MathSciNet  Google Scholar 

  2. Amir, A., Calinescu, G.: Alphabet independent and dictionary scaled matching. J. of Algorithms 36, 34–62 (2000)

    Article  MATH  MathSciNet  Google Scholar 

  3. Amir, A., Farach, M., Giancarlo, R., Galil, Z., Park, K.: Dynamic dictionary matching. Journal of Computer and System Sciences 49(2), 208–222 (1994)

    Article  MATH  MathSciNet  Google Scholar 

  4. Amir, A., Farach, M., Idury, R.M., La Poutré, J.A., Schäffer, A.A.: Improved dynamic dictionary matching. Information and Computation 119(2), 258–282 (1995)

    Article  MATH  MathSciNet  Google Scholar 

  5. Amir, A., Keselman, D., Landau, G.M., Lewenstein, M., Lewenstein, N., Rodeh, M.: Indexing and dictionary matching with one error. In: Dehne, F., Gupta, A., Sack, J.-R., Tamassia, R. (eds.) WADS 1999. LNCS, vol. 1663, pp. 181–192. Springer, Heidelberg (1999)

    Chapter  Google Scholar 

  6. Bille, P., Gørtz, I.L., Vildhøj, H.W., Wind, D.K.: String matching with variable length gaps. Theoretical Computer Science (443), 25–34 (2012)

    Google Scholar 

  7. Bille, P., Thorup, M.: Faster regular expression matching. In: Albers, S., Marchetti-Spaccamela, A., Matias, Y., Nikoletseas, S., Thomas, W. (eds.) ICALP 2009, Part I. LNCS, vol. 5555, pp. 171–182. Springer, Heidelberg (2009)

    Chapter  Google Scholar 

  8. Bille, P., Thorup, M.: Regular expression matching with multi-strings and intervals. In: Proc. 21st Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pp. 1297–1308 (2010)

    Google Scholar 

  9. Brodal, G.S., Gasieniec, L.: Approximate dictionary queries. In: Hirschberg, D.S., Meyers, G. (eds.) CPM 1996. LNCS, vol. 1075, pp. 65–74. Springer, Heidelberg (1996)

    Chapter  Google Scholar 

  10. Cole, R., Gottlieb, L., Lewenstein, M.: Dictionary matching and indexing with errors and don’t cares. In: Proc. 36th Annual ACM Symposium on the Theory of Computing (STOC), pp. 91–100. ACM Press (2004)

    Google Scholar 

  11. Chan, T.M., Larsen, K.G., Pǎtraşcu, M.: Orthogonal range searching on the ram, revisited. In: Proc. 27th ACM Symposium on Computational Geometry (SoCG), pp. 1–10 (2011)

    Google Scholar 

  12. Fredriksson, K., Grabowski, S.: Efficient algorithms for pattern matching with general gaps, character classes and transposition invariance. Inf. Retr. 11(4), 338–349 (2008)

    Article  Google Scholar 

  13. Haapasalo, T., Silvasti, P., Sippu, S., Soisalon-Soininen, E.: Online Dictionary Matching with Variable-Length Gaps. In: Pardalos, P.M., Rebennack, S. (eds.) SEA 2011. LNCS, vol. 6630, pp. 76–87. Springer, Heidelberg (2011)

    Chapter  Google Scholar 

  14. Hofmann, K., Bucher, P., Falquet, L., Bairoch, A.: The PROSITE database. Nucleic Acids Res. (27), 215–219 (1999)

    Google Scholar 

  15. Krishnamurthy, M., Seagren, E.S., Alder, R., Bayles, A.W., Burke, J., Carter, S., Faskha, E.: How to Cheat at Securing Linux. Syngress Publishing, Inc., Elsevier, Inc., 30 Corporate Dr., Burlington, MA 01803 (2008), e-edition: http://www.sciencedirect.com/science/book/9781597492072

  16. Kucherov, G., Rusinowitch, M.: Matching a set of strings with variable length don’t cares. Theoret. Comput. Sci. 178(12), 129–154 (1997)

    Article  MATH  MathSciNet  Google Scholar 

  17. McCreight, E.M.: A Space-Economical Suffix Tree Construction Algorithm. Journal of the ACM 23(2), 262–272 (1976)

    Article  MATH  MathSciNet  Google Scholar 

  18. Morgante, M., Policriti, A., Vitacolonna, N., Zuccolo, A.: Structured motifs search. J. Comput. Bio. 12(8), 1065–1082 (2005)

    Article  Google Scholar 

  19. Myers, G.: A four-russian algorithm for regular expression pattern matching. J. ACM 39(2), 430–448 (1992)

    Article  MATH  Google Scholar 

  20. Myers, G., Mehldau, G.: A system for pattern matching applications on biosequences. CABIOS 9(3), 299–314 (1993)

    Google Scholar 

  21. Naa, J.C., Apostolico, A., Iliopoulos, C.S., Park, K.: Truncated suffix trees and their application to data compression. Theoretical Computer Science 304(3), 87–101 (2003)

    Article  MathSciNet  Google Scholar 

  22. Navarro, G., Raffinot, M.: Fast and simple character classes and bounded gaps pattern matching, with applications to protein searching. J. Comput. Bio. 10(6), 903–923 (2003)

    Article  Google Scholar 

  23. Navarro, G., Raffinot, M.: New techniques for regular expression searching. Algorithmica 41(2), 89–116 (2004)

    Article  MathSciNet  Google Scholar 

  24. Rahman, M.S., Iliopoulos, C.S., Lee, I., Mohamed, M., Smyth, W.F.: Finding patterns with variable length gaps or don’t cares. In: Chen, D.Z., Lee, D.T. (eds.) COCOON 2006. LNCS, vol. 4112, pp. 146–155. Springer, Heidelberg (2006)

    Chapter  Google Scholar 

  25. Ukkonen, E.: On-line construction of suffix trees. Algorithmica 14(3), 249–260 (1995)

    Article  MATH  MathSciNet  Google Scholar 

  26. van Emde Boas, P.: Preserving order in a forest in less than logarithmic time. In: Proceedings of the 16th Annual Symposium on Foundations of Computer Science, pp. 75–84 (1975)

    Google Scholar 

  27. Verint. Packet intrusion detection. Personal communication (2013)

    Google Scholar 

  28. Weiner, P.: Linear pattern matching algorithm. In: Proc. 14 IEEE Symposium on Switching and Automata Theory, pp. 1–11 (1973)

    Google Scholar 

  29. Zhang, M., Zhang, Y., Hu, L.: A faster algorithm for matching a set of patterns with variable length don’t cares. Inform. Process. Letters 110, 216–220 (2010)

    Article  MATH  Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Amir, A., Levy, A., Porat, E., Shalom, B.R. (2014). Dictionary Matching with One Gap. In: Kulikov, A.S., Kuznetsov, S.O., Pevzner, P. (eds) Combinatorial Pattern Matching. CPM 2014. Lecture Notes in Computer Science, vol 8486. Springer, Cham. https://doi.org/10.1007/978-3-319-07566-2_2

Download citation

  • DOI: https://doi.org/10.1007/978-3-319-07566-2_2

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-07565-5

  • Online ISBN: 978-3-319-07566-2

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics