
New Bit-Parallel Indel-Distance Algorithm

  • Heikki Hyyrö
  • Yoan Pinzon
  • Ayumi Shinohara
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3503)

Abstract

The task of approximate string matching is to find all locations at which a pattern string p of length m matches a substring of a text string t of length n with at most k differences. It is common to use Levenshtein distance [5], which allows the differences to be single-character insertions, deletions, or substitutions. Recently, in [3], the IndelMYE, IndelWM and IndelBYN algorithms were introduced as modified versions of the bit-parallel algorithms of Myers [6], Wu & Manber [10] and Baeza-Yates & Navarro [1], respectively. These modified versions were made to support the indel distance (only single-character insertions and/or deletions are allowed). In this paper we present an improved version of IndelMYE that makes better use of the bit-operations and runs 24.5 percent faster in practice. Finally, we present a complete set of experimental results to support our findings.
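To make the problem statement concrete, the following is a minimal sketch of approximate matching under indel distance using plain column-wise dynamic programming. It is not the bit-parallel IndelMYE algorithm of the paper, only a reference formulation of the same problem; the function name `indel_search` and its return convention (1-based end positions in the text) are assumptions made for illustration.

```python
def indel_search(pattern, text, k):
    """Return 1-based end positions j in text where some substring ending
    at j matches pattern with indel distance <= k.

    Illustrative O(m*n) dynamic programming, not the paper's bit-parallel
    method. The name and return convention are assumptions.
    """
    m = len(pattern)
    # Column for the empty text prefix: D[i][0] = i (delete i pattern chars).
    col = list(range(m + 1))
    ends = []
    for j, c in enumerate(text, start=1):
        # D[0][j] = 0 for all j, so a match may start anywhere in the text.
        prev_diag = col[0]
        for i in range(1, m + 1):
            old = col[i]  # D[i][j-1]
            if pattern[i - 1] == c:
                # Matching characters cost nothing.
                col[i] = prev_diag
            else:
                # No substitution: only an insertion or a deletion, cost 1.
                col[i] = 1 + min(col[i - 1], old)
            prev_diag = old
        if col[m] <= k:
            ends.append(j)
    return ends


print(indel_search("abc", "xxabcxx", 0))  # exact occurrence ends at position 5
print(indel_search("abc", "abxc", 1))     # one inserted character is tolerated
```

The only difference from the Levenshtein recurrence is the mismatch case: a substitution is unavailable, so a mismatched pair costs one insertion plus one deletion. For example, "abc" against "abd" has Levenshtein distance 1 but indel distance 2.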

References

  1. Baeza-Yates, R., Navarro, G.: Faster approximate string matching. Algorithmica 23(2), 127–158 (1999)
  2. Hyyrö, H., Fredriksson, K., Navarro, G.: Increased bit-parallelism for approximate string matching. In: Ribeiro, C.C., Martins, S.L. (eds.) WEA 2004. LNCS, vol. 3059, pp. 285–298. Springer, Heidelberg (2004)
  3. Hyyrö, H., Pinzon, Y., Shinohara, A.: Fast bit-vector algorithms for approximate string matching under indel distance. In: Vojtáš, P., Bieliková, M., Charron-Bost, B., Sýkora, O. (eds.) SOFSEM 2005. LNCS, vol. 3381, pp. 380–384. Springer, Heidelberg (2005)
  4. Hyyrö, H.: Explaining and extending the bit-parallel approximate string matching algorithm of Myers. Technical Report A-2001-10, Dept. of Computer and Information Sciences, University of Tampere, Tampere, Finland (2001)
  5. Levenshtein, V.I.: Binary codes capable of correcting spurious insertions and deletions of ones (original in Russian). Russian Problemy Peredachi Informatsii 1, 12–25 (1965)
  6. Myers, G.: A fast bit-vector algorithm for approximate string matching based on dynamic programming. Journal of the ACM 46(3), 395–415 (1999)
  7. Navarro, G.: A guided tour to approximate string matching. ACM Computing Surveys 33(1), 31–88 (2001)
  8. Ukkonen, E.: Finding approximate patterns in strings. Journal of Algorithms 6, 132–137 (1985)
  9. Wright, A.: Approximate string matching using within-word parallelism. Software Practice and Experience 24(4), 337–362 (1994)
  10. Wu, S., Manber, U.: Fast text searching allowing errors. Comm. of the ACM 35(10), 83–91 (1992)

Copyright information

© Springer-Verlag Berlin Heidelberg 2005

Authors and Affiliations

  • Heikki Hyyrö (1)
  • Yoan Pinzon (2)
  • Ayumi Shinohara (1, 3)
  1. PRESTO, Japan Science and Technology Agency (JST), Japan
  2. Department of Computer Science, King’s College, London, UK
  3. Department of Informatics, Kyushu University 33, Fukuoka, Japan
