BFM: a forward backward string matching algorithm with improved shifting for information retrieval

  • MD. Obaidullah Al-Faruk
  • K. M. Akib Hussain
  • MD. Adnan Shahriar
  • Shakila Mahjabin Tonni
Original Research

Abstract

Mining data from text is often a crucial part of data mining tasks. With the growing use of cloud services and the increasing number of files shared over the internet, the need for string matching algorithms in text mining has grown rapidly. Such algorithms should make as few character comparisons and pattern shifts as possible while searching. In this paper, we propose a new algorithm, Back and Forth Matching (BFM), that speeds up the search by matching a pattern from both the forward and backward directions. Compared with other algorithms, it shows a marked improvement when matching strings in large text files.
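The idea of comparing a pattern from both ends can be illustrated with a minimal sketch. This is not the BFM algorithm itself, whose comparison order and shift rule are not given in the abstract. It assumes a simple scheme in which each window is checked inward from its first and last characters, and it borrows the well-known Quick Search (Sunday) bad-character shift to move the window. The names `build_shift_table` and `two_ended_search` are hypothetical.

```python
def build_shift_table(pattern):
    """Quick Search (Sunday) shift table, used here as an assumed shift rule.

    For each character, store the distance from its last occurrence in the
    pattern to the position just past the pattern's end.
    """
    m = len(pattern)
    return {ch: m - idx for idx, ch in enumerate(pattern)}


def two_ended_search(text, pattern):
    """Return the start index of every occurrence of pattern in text.

    Each window is compared from both ends toward the middle, so a mismatch
    at either the front or the back of the window is detected early.
    """
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return []

    shift = build_shift_table(pattern)
    matches = []
    i = 0
    while i <= n - m:
        left, right = 0, m - 1
        # Advance the forward and backward pointers together.
        while (left <= right
               and text[i + left] == pattern[left]
               and text[i + right] == pattern[right]):
            left += 1
            right -= 1
        if left > right:
            matches.append(i)
        if i + m >= n:
            break  # no character follows the window, so no further shift
        # Shift by the character just past the window; characters that do
        # not occur in the pattern allow a jump of m + 1.
        i += shift.get(text[i + m], m + 1)
    return matches


print(two_ended_search("abracadabra", "abra"))
```

Because the Quick Search shift depends only on the character following the window, it stays valid whatever order the characters inside the window are compared in, which is why it pairs safely with a two-ended comparison in this sketch.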

Keywords

Data mining, Text search, Text mining, String matching, Exact pattern matching, Knuth–Morris–Pratt, Boyer–Moore

Copyright information

© Bharati Vidyapeeth's Institute of Computer Applications and Management 2019

Authors and Affiliations

  1. Department of Computer Science and Engineering, East West University, Dhaka, Bangladesh
