Simple Optimal String Matching Algorithm

Extended Abstract
  • Cyril Allauzen
  • Mathieu Raffinot
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1848)

Abstract

We present a new string matching algorithm that is linear in the worst case, running in O(m + n) time, where n is the size of the text and m the size of the searched word, both taken over an alphabet Σ, and optimal on average, running in O(m + n log_|Σ|(m) / m) time when letters are equiprobable and independent. Of all the algorithms that achieve these two complexities, ours is the simplest, since it uses only a single structure, a suffix automaton. Moreover, its preprocessing phase is linearly dynamical, i.e. it is possible to search for the words p_1, then p_1p_2, p_1p_2p_3, ..., p_1p_2p_3...p_i with O(|p_1| + |p_2| + ... + |p_i|) total preprocessing time. Among the algorithms that have this property (for instance, the Knuth-Morris-Pratt algorithm), our algorithm is the only one that is optimal on average.
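
The "linearly dynamical" preprocessing property can be illustrated with Knuth-Morris-Pratt, which the abstract names as one algorithm that has it. The sketch below is not the suffix-automaton algorithm of the paper; it is a minimal Python illustration, with a hypothetical class name `IncrementalKMP`, of how a failure table can be extended piece by piece so that preprocessing p_1, then p_1p_2, and so on costs time proportional to the total length appended.

```python
class IncrementalKMP:
    """KMP failure table that grows as pattern pieces are appended.

    Illustrative only: the class and method names are assumptions,
    not taken from the paper.
    """

    def __init__(self):
        self.pattern = []  # characters of p_1 p_2 ... p_i appended so far
        self.fail = []     # fail[k] = length of the longest proper border of pattern[:k+1]

    def append(self, piece):
        """Extend the pattern by `piece`, reusing the existing table."""
        for ch in piece:
            k = self.fail[-1] if self.fail else 0
            # Fall back through shorter borders until one can be extended by ch.
            while k > 0 and self.pattern[k] != ch:
                k = self.fail[k - 1]
            if self.pattern and self.pattern[k] == ch:
                k += 1
            self.pattern.append(ch)
            self.fail.append(k)

    def search(self, text):
        """Return the start positions of all occurrences of the current pattern."""
        m = len(self.pattern)
        if m == 0:
            return []
        hits = []
        k = 0  # number of pattern characters currently matched
        for i, ch in enumerate(text):
            while k > 0 and self.pattern[k] != ch:
                k = self.fail[k - 1]
            if self.pattern[k] == ch:
                k += 1
            if k == m:
                hits.append(i - m + 1)
                k = self.fail[k - 1]  # keep going to find overlapping matches
        return hits


kmp = IncrementalKMP()
kmp.append("ab")                 # preprocess p_1 = "ab"
first = kmp.search("abcabab")
kmp.append("ab")                 # extend to p_1 p_2 = "abab" without starting over
second = kmp.search("abcabab")
```

Each call to `append` only does work for the new characters (amortized constant time per character), which is the sense in which the preprocessing is incremental. This scheme keeps the linear worst case but not the sublinear average-case behaviour that the abstract claims for the suffix-automaton approach.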

References

  1. C. Allauzen and M. Raffinot. Simple optimal string matching. Technical Report 99-14, Institut Gaspard-Monge, Université de Marne-la-Vallée, 1999. Accepted for publication in Journal of Algorithms.
  2. R. S. Boyer and J. S. Moore. A fast string searching algorithm. Commun. ACM, 20(10):762–772, 1977.
  3. M. Crochemore. Transducers and repetitions. Theor. Comput. Sci., 45(1):63–86, 1986.
  4. M. Crochemore. Constant-space string-matching. In K. V. Nori and S. Kumar, editors, Proceedings of the 8th Conference on Foundations of Software Technology and Theoretical Computer Science, number 338 in Lecture Notes in Computer Science, pages 80–87. Springer-Verlag, Berlin, 1988.
  5. M. Crochemore, L. Gąsieniec, and W. Rytter. Constant-space string matching in sublinear average time. In B. Carpentieri, A. De Santis, U. Vaccaro, and J. A. Storer, editors, Compression and Complexity of Sequences, pages 230–239. IEEE Computer Society, 1998.
  6. M. Crochemore and W. Rytter. Text algorithms. Oxford University Press, 1994.
  7. A. Czumaj, M. Crochemore, L. Gąsieniec, S. Jarominek, T. Lecroq, W. Plandowski, and W. Rytter. Speeding up two string-matching algorithms. Algorithmica, 12:247–267, 1994.
  8. Z. Galil and J. Seiferas. Time-space optimal string matching. J. Comput. Syst. Sci., 26(3):280–294, 1983.
  9. L. Gąsieniec, W. Plandowski, and W. Rytter. Constant-space string matching with smaller number of comparisons: sequential sampling. In Z. Galil and E. Ukkonen, editors, Proceedings of the 6th Annual Symposium on Combinatorial Pattern Matching, number 937 in Lecture Notes in Computer Science, pages 78–89, Espoo, Finland, 1995. Springer-Verlag, Berlin.
  10. D. E. Knuth, J. H. Morris, Jr, and V. R. Pratt. Fast pattern matching in strings. SIAM J. Comput., 6(1):323–350, 1977.
  11. T. Lecroq. Recherches de mot. Thèse de doctorat, Université d’Orléans, France, 1992.
  12. A. C. Yao. The complexity of pattern matching for a random string. SIAM J. Comput., 8(3):368–387, 1979.
  13. J. Ziv and A. Lempel. A universal algorithm for sequential data compression. IEEE Trans. Inf. Theory, 23:337–343, 1977.

Copyright information

© Springer-Verlag Berlin Heidelberg 2000

Authors and Affiliations

  • Cyril Allauzen (1)
  • Mathieu Raffinot (1)
  1. Institut Gaspard-Monge, Université de Marne-la-Vallée, Cité Descartes, Champs-sur-Marne, Marne-la-Vallée Cedex 2, France