Abstract
String matching is the problem of finding all the substrings of a text that correspond to a given pattern. It is one of the most investigated problems in computer science, mainly because of its applications in many fields. In recent years, most solutions to the problem have focused on the efficiency and flexibility of the searching procedure, and effective techniques have appeared to speed up previous solutions. In this paper we present a simple and very efficient algorithm for string matching. It can be seen as an extension of the Skip-Search algorithm to condensed alphabets, with the aim of reducing the number of verifications during the searching phase. Our experimental results show that the new variant obtains, in most cases, the best running time when compared against the most effective algorithms in the literature. This makes the new algorithm one of the most flexible solutions in practical cases.
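As a rough illustration of the idea, the following Python sketch applies a Skip-Search-style bucket lookup to q-grams instead of single characters. It is not the algorithm of the paper: the function name `skip_search_qgrams`, the parameter `q`, and the use of a dictionary keyed by the q-gram itself are assumptions made for this example, and the abstract does not say how a condensed alphabet is actually encoded. The sketch only shows why longer keys tend to produce fewer candidate positions to verify.

```python
from collections import defaultdict


def skip_search_qgrams(pattern, text, q=2):
    """Return the sorted start positions of all occurrences of pattern in text.

    Illustrative sketch only: Skip-Search buckets keyed by q-grams.
    The dictionary keyed by the raw q-gram is an assumption of this example.
    """
    m, n = len(pattern), len(text)
    if q < 1 or m < q:
        raise ValueError("need 1 <= q <= len(pattern)")

    # Preprocessing: for every q-gram of the pattern, record where it starts.
    buckets = defaultdict(list)
    for i in range(m - q + 1):
        buckets[pattern[i:i + q]].append(i)

    # Any occurrence contains m - q + 1 consecutive q-gram start positions,
    # so sampling the text every m - q + 1 positions hits each occurrence once.
    step = m - q + 1
    matches = []
    for j in range(m - q, n - q + 1, step):
        # Each bucket entry i aligns pattern[i:i+q] with text[j:j+q].
        for i in buckets.get(text[j:j + q], ()):
            s = j - i  # candidate start of an occurrence
            if s >= 0 and s + m <= n and text[s:s + m] == pattern:
                matches.append(s)
    return sorted(matches)
```

For example, `skip_search_qgrams("abab", "abababab", q=2)` returns `[0, 2, 4]`. With `q=1` the sketch reduces to a character-based Skip-Search; larger values of `q` make each bucket shorter, at the cost of a smaller shift between sampled text positions.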
This work has been supported by G.N.C.S., Istituto Nazionale di Alta Matematica “Francesco Severi”.
Notes
1. The Smart tool is available online at http://www.dmi.unict.it/~faro/smart/.
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Faro, S. (2016). A Very Fast String Matching Algorithm Based on Condensed Alphabets. In: Dondi, R., Fertin, G., Mauri, G. (eds) Algorithmic Aspects in Information and Management. AAIM 2016. Lecture Notes in Computer Science(), vol 9778. Springer, Cham. https://doi.org/10.1007/978-3-319-41168-2_6
DOI: https://doi.org/10.1007/978-3-319-41168-2_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-41167-5
Online ISBN: 978-3-319-41168-2
eBook Packages: Computer Science, Computer Science (R0)