Abstract
In the pattern matching with pair correlation distance problem, the goal is to find all occurrences of a pattern P of length m in a text T of length n at which the distance between P and the aligned text window is less than a threshold k. For each text location i, the distance is defined as the number of different kinds of mismatched pairs (α,β) between P and T[i ... i + m]. We present an algorithm for this problem with running time \(O\left(\min\{|\Sigma_P|^2\, n \log m,\; n\,(m \log m)^{2/3}\}\right)\). Another interesting problem is the one-sided pair correlation distance, where the goal is to find all occurrences of P for which the number of mismatched characters in P is less than k. For this problem, we present an algorithm with running time \(O\left(\min\{|\Sigma_P|\, n \log m,\; n \sqrt{m \log m}\}\right)\).
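The two distances can be illustrated with a brute-force sketch. This is not the algorithm of the paper: it runs in O(nm) time and only makes the definitions concrete. It assumes that the window at location i has exactly m characters, and that "mismatched characters in P" means the distinct pattern symbols involved in at least one mismatch; the function names are hypothetical.

```python
def pair_correlation_distances(text, pattern):
    """For each alignment i, count the distinct mismatched pairs (alpha, beta),
    where alpha comes from the pattern and beta from the text window."""
    n, m = len(text), len(pattern)
    distances = []
    for i in range(n - m + 1):
        window = text[i:i + m]  # assumption: window of exactly m characters
        pairs = {(p, t) for p, t in zip(pattern, window) if p != t}
        distances.append(len(pairs))
    return distances


def one_sided_distances(text, pattern):
    """For each alignment i, count the distinct pattern characters that are
    mismatched at least once (assumed reading of the one-sided distance)."""
    n, m = len(text), len(pattern)
    distances = []
    for i in range(n - m + 1):
        window = text[i:i + m]
        chars = {p for p, t in zip(pattern, window) if p != t}
        distances.append(len(chars))
    return distances


def occurrences(distances, k):
    """Locations whose distance is strictly less than the threshold k."""
    return [i for i, d in enumerate(distances) if d < k]
```

For example, with P = "abab" and T = "acacabab", the alignment at location 3 compares "abab" with "caba" and yields the three distinct pairs (a,c), (b,a), (a,b), while only two distinct pattern characters are mismatched, so the two distances differ there.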
© 2008 Springer-Verlag Berlin Heidelberg
Cite this paper
Porat, B., Porat, E., Zur, A. (2008). Pattern Matching with Pair Correlation Distance. In: Amir, A., Turpin, A., Moffat, A. (eds) String Processing and Information Retrieval. SPIRE 2008. Lecture Notes in Computer Science, vol 5280. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-89097-3_24
DOI: https://doi.org/10.1007/978-3-540-89097-3_24
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-89096-6
Online ISBN: 978-3-540-89097-3
eBook Packages: Computer Science (R0)