Exact Analysis of Horspool’s and Sunday’s Pattern Matching Algorithms with Probabilistic Arithmetic Automata
We define deterministic arithmetic automata (DAAs) and connect them to a framework called probabilistic arithmetic automata (PAAs). We use DAAs and PAAs to compute the entire exact probability distribution (in contrast to, e.g., asymptotic expectation and variance) of the number \(X^p_\ell\) of text characters accessed by the Horspool or Sunday pattern matching algorithms when matching a fixed pattern p against a random text of length ℓ. The random text model can be quite general, ranging from simple uniform models to higher-order Markov models or hidden Markov models (HMMs). We develop several alternative constructions with different automaton state spaces, leading to different time and space complexities for the computations. To our knowledge, this is the first time that suffix-based pattern matching algorithms have been analyzed exactly. We present (perhaps surprising) exemplary results on short patterns and moderate text lengths. Our results easily generalize to any search-window-based pattern matching algorithm.
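To make the quantity \(X^p_\ell\) concrete, the following minimal sketch counts the text characters accessed by a textbook Horspool search and then obtains the exact distribution by enumerating every text of length ℓ. It is not the DAA/PAA construction described above: it is a brute-force baseline that is only feasible for tiny alphabets and short texts. It assumes a uniform i.i.d. text model and that each character comparison counts as one access. The function names `horspool_accesses` and `access_distribution` are hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def horspool_accesses(pattern, text):
    """Count text character comparisons made by a textbook Horspool search.

    Assumption: every comparison of a text character counts as one access,
    and the search continues after a match (all occurrences are searched).
    """
    m, n = len(pattern), len(text)
    # Shift table: distance from the rightmost occurrence of each character
    # in pattern[:-1] to the last pattern position; other characters shift by m.
    shift = {c: m - 1 - i for i, c in enumerate(pattern[:-1])}
    accesses = 0
    pos = 0  # left end of the current search window
    while pos + m <= n:
        j = m - 1
        # Compare right to left until a mismatch or a full match.
        while j >= 0:
            accesses += 1
            if text[pos + j] != pattern[j]:
                break
            j -= 1
        # Horspool shifts according to the rightmost character of the window.
        pos += shift.get(text[pos + m - 1], m)
    return accesses


def access_distribution(pattern, alphabet, length):
    """Exact distribution of the access count under a uniform i.i.d. text model,
    computed by exhaustive enumeration of all len(alphabet) ** length texts."""
    counts = Counter(
        horspool_accesses(pattern, "".join(chars))
        for chars in product(alphabet, repeat=length)
    )
    total = len(alphabet) ** length
    return {k: Fraction(v, total) for k, v in sorted(counts.items())}


if __name__ == "__main__":
    for value, prob in access_distribution("ab", "ab", 6).items():
        print(value, prob)
```

The enumeration cost grows as |Σ|^ℓ, which is exactly why an automaton-based computation is attractive for moderate text lengths; a brute-force routine like this one is mainly useful for cross-checking results on very small instances.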
- 1. Baeza-Yates, R.A., Gonnet, G.H., Régnier, M.: Analysis of Boyer-Moore-type string searching algorithms. In: SODA ’90: Proceedings of the first annual ACM-SIAM symposium on Discrete algorithms, pp. 328–343. SIAM, Philadelphia (1990)