Complexity of Sequential Pattern Matching Algorithms

  • Conference paper
Randomization and Approximation Techniques in Computer Science (RANDOM 1998)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 1518))

Abstract

We formally define a class of sequential pattern matching algorithms that includes all variations of the Morris-Pratt algorithm. It has been known for twenty years that the complexity of such algorithms is bounded by a linear function of the text length. Recently, substantial progress has been made in identifying lower bounds. We now prove that a linearity constant exists asymptotically for both the worst and the average cases. Using the Subadditive Ergodic Theorem, we prove almost sure convergence. Our results hold for any given pattern and text, and for stationary ergodic pattern and text. In the course of the proof, we establish a structural property, namely the existence of “unavoidable positions” where the algorithm must stop to compare. This property seems to be unique to Morris-Pratt type algorithms (e.g., the Boyer-Moore algorithm does not possess it).
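
To make the quantity under study concrete, the following is a minimal sketch of the textbook Morris-Pratt search that counts text-pattern character comparisons. It is not the formal class of sequential algorithms defined in the paper, which the abstract does not spell out; the function names `border_table` and `mp_search` are hypothetical and chosen for illustration only. For a text of length n, the count returned lies between n and 2n, and the ratio of the count to n is the kind of linearity constant the abstract refers to.

```python
def border_table(pattern):
    """Return border[i] = length of the longest proper border of pattern[:i].

    border[0] is set to -1 as a sentinel meaning "shift past this text position".
    """
    m = len(pattern)
    border = [-1] * (m + 1)
    k = -1
    for i in range(m):
        # Fall back through shorter borders until one can be extended.
        while k >= 0 and pattern[k] != pattern[i]:
            k = border[k]
        k += 1
        border[i + 1] = k
    return border


def mp_search(pattern, text):
    """Find all occurrences of a non-empty pattern in text.

    Returns (start_positions, comparisons), where comparisons counts only
    text-versus-pattern character comparisons (illustrative assumption:
    preprocessing cost is not counted).
    """
    if not pattern:
        raise ValueError("pattern must be non-empty")
    border = border_table(pattern)
    m = len(pattern)
    matches = []
    comparisons = 0
    k = 0  # number of pattern characters currently matched
    for i, c in enumerate(text):
        while True:
            comparisons += 1
            if pattern[k] == c:
                k += 1
                break
            k = border[k]  # mismatch: shift the pattern, keep the text position
            if k < 0:
                k = 0  # no border left: move on to the next text character
                break
        if k == m:
            matches.append(i - m + 1)
            k = border[m]  # continue, allowing overlapping occurrences
    return matches, comparisons
```

For example, `mp_search("aab", "aaab")` returns `([1], 5)`: the third text character is compared twice, once before and once after the pattern is shifted, which is the kind of position where the algorithm has to stop and compare again.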

The project was supported by NATO Collaborative Grant CRG.950060, the ESPRIT III Program No. 7141 ALCOM II, and NSF Grants NCR-9415491, NCR-9804760.

Copyright information

© 1998 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Régnier, M., Szpankowski, W. (1998). Complexity of Sequential Pattern Matching Algorithms. In: Luby, M., Rolim, J.D.P., Serna, M. (eds) Randomization and Approximation Techniques in Computer Science. RANDOM 1998. Lecture Notes in Computer Science, vol 1518. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-49543-6_16

  • DOI: https://doi.org/10.1007/3-540-49543-6_16

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-65142-0

  • Online ISBN: 978-3-540-49543-7

  • eBook Packages: Springer Book Archive
