A Very Fast String Matching Algorithm Based on Condensed Alphabets

  • Conference paper
Algorithmic Aspects in Information and Management (AAIM 2016)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9778)

Abstract

String matching is the problem of finding all the substrings of a text that correspond to a given pattern. It is one of the most investigated problems in computer science, mainly due to its applications in many fields. In recent years, most solutions to the problem have focused on the efficiency and flexibility of the searching procedure, and effective techniques have appeared to speed up previous solutions. In this paper we present a simple and very efficient algorithm for string matching. It can be seen as an extension of the Skip-Search algorithm to condensed alphabets, with the aim of reducing the number of verifications during the searching phase. Our experimental results show that the new variant obtains, in most cases, the best running time when compared against the most effective algorithms in the literature. This makes the new algorithm one of the most flexible solutions in practical cases.
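
The idea summarized in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation, which is not reproduced on this page. It assumes that a "condensed alphabet" means treating each group of q consecutive characters (a q-gram) as a single super-character, and that Skip-Search keeps, for each such super-character, a bucket of the pattern positions where it occurs. The function name `skip_search_qgrams` and the parameter `q` are hypothetical.

```python
from collections import defaultdict


def skip_search_qgrams(pattern, text, q=2):
    """Return the sorted start positions of all occurrences of pattern in text.

    Illustrative sketch only: buckets are keyed by q-grams of the pattern
    (an assumed reading of "condensed alphabet").
    """
    m, n = len(pattern), len(text)
    if not 1 <= q <= m:
        raise ValueError("q must satisfy 1 <= q <= len(pattern)")

    # Preprocessing: for every q-gram of the pattern, record where it starts.
    buckets = defaultdict(list)
    for i in range(m - q + 1):
        buckets[pattern[i:i + q]].append(i)

    # Searching: any occurrence of the pattern contains m - q + 1 consecutive
    # q-gram start positions, so sampling one text q-gram every m - q + 1
    # positions hits each occurrence exactly once.
    shift = m - q + 1
    matches = []
    for j in range(m - q, n - q + 1, shift):
        # Only pattern positions holding the same q-gram need verification.
        for i in buckets.get(text[j:j + q], ()):
            start = j - i
            if start >= 0 and start + m <= n and text[start:start + m] == pattern:
                matches.append(start)
    return sorted(matches)


print(skip_search_qgrams("aba", "abababa", q=2))  # prints [0, 2, 4]
```

With q = 1 the sketch reduces to plain Skip-Search over single characters. Larger values of q make a sampled q-gram less likely to appear in the pattern by chance, so fewer candidate alignments are verified, at the cost of a shorter shift between samples.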

This work has been supported by G.N.C.S., Istituto Nazionale di Alta Matematica “Francesco Severi”.

Notes

  1. The Smart tool is available online at http://www.dmi.unict.it/~faro/smart/.

Author information

Corresponding author

Correspondence to Simone Faro.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Faro, S. (2016). A Very Fast String Matching Algorithm Based on Condensed Alphabets. In: Dondi, R., Fertin, G., Mauri, G. (eds) Algorithmic Aspects in Information and Management. AAIM 2016. Lecture Notes in Computer Science, vol 9778. Springer, Cham. https://doi.org/10.1007/978-3-319-41168-2_6

  • DOI: https://doi.org/10.1007/978-3-319-41168-2_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-41167-5

  • Online ISBN: 978-3-319-41168-2

  • eBook Packages: Computer Science (R0)
