Period Recovery over the Hamming and Edit Distances

  • Amihood Amir
  • Mika Amit
  • Gad M. Landau
  • Dina Sokol
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9644)

Abstract

A string S of length n has period P of length p if \(S[i]=S[i+p]\) for all \(1 \le i \le n-p\) and \(n \ge 2p\). The shortest such substring, P, is called the period of S, and the string S is called periodic in P. In this paper we investigate the period recovery problem. Given a string S of length n, find the primitive period(s) P such that the distance between S and the string that is periodic in P is below a threshold \(\tau \). We consider the period recovery problem over both the Hamming distance and the edit distance. For the Hamming distance case, we present an \(O(n \log n)\) time algorithm, where \(\tau \) is given as \(\frac{n}{(2+\epsilon )p}\), for \(0 < \epsilon < 1\). For the edit distance case, \(\tau =\frac{n}{(4+\epsilon )p}\), and we provide an \(O(n^{4/ 3})\) time algorithm.
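To make the definitions above concrete, here is a minimal brute-force sketch for the Hamming distance case. It is not the \(O(n \log n)\) algorithm of the paper, only an illustration of the problem statement. It assumes 0-indexed strings, and the names `has_period`, `is_primitive`, `closest_period`, and `recover_periods_hamming` are hypothetical. For each candidate length p it builds the closest periodic string by majority vote over positions that share the same index modulo p, then applies the threshold \(\tau = \frac{n}{(2+\epsilon )p}\). Only the majority-vote candidate is checked for each p, which is a simplifying assumption.

```python
from collections import Counter
from typing import List, Tuple


def has_period(s: str, p: int) -> bool:
    """Exact periodicity: s[i] == s[i + p] for all valid i, and n >= 2p."""
    n = len(s)
    return p >= 1 and n >= 2 * p and all(s[i] == s[i + p] for i in range(n - p))


def is_primitive(u: str) -> bool:
    """A string is primitive if it is not a power of a shorter string."""
    return len(u) > 0 and (u + u).find(u, 1) == len(u)


def closest_period(s: str, p: int) -> Tuple[str, int]:
    """Return the length-p string P (by majority vote) and the Hamming
    distance between s and the length-n string that is periodic in P."""
    period = []
    mismatches = 0
    for r in range(p):
        column = s[r::p]  # all symbols at positions congruent to r mod p
        symbol, count = Counter(column).most_common(1)[0]
        period.append(symbol)
        mismatches += len(column) - count
    return "".join(period), mismatches


def recover_periods_hamming(s: str, eps: float = 0.5) -> List[Tuple[str, int]]:
    """Brute force (illustrative only): try every p with n >= 2p and keep
    primitive candidates whose distance is below tau = n / ((2 + eps) * p)."""
    n = len(s)
    found = []
    for p in range(1, n // 2 + 1):
        tau = n / ((2 + eps) * p)
        candidate, dist = closest_period(s, p)
        if dist < tau and is_primitive(candidate):
            found.append((candidate, dist))
    return found


if __name__ == "__main__":
    corrupted = "abc" * 6 + "abx" + "abc"  # one substitution error
    print(recover_periods_hamming(corrupted))
```

The sketch runs in roughly quadratic time because it scans the whole string for every candidate length. It is useful for checking small inputs, not as a substitute for the algorithm the paper presents.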

Keywords

Period recovery, Approximate periodicity, Hamming distance, Edit distance

Copyright information

© Springer-Verlag Berlin Heidelberg 2016

Authors and Affiliations

  • Amihood Amir (1, 2)
  • Mika Amit (3), email author
  • Gad M. Landau (3, 4)
  • Dina Sokol (5)

  1. Department of Mathematics and Computer Science, Bar-Ilan University, Ramat Gan, Israel
  2. College of Computing, Georgia Tech, Atlanta, USA
  3. Department of Computer Science, University of Haifa, Haifa, Israel
  4. Department of Computer Science and Engineering, NYU Polytechnic School of Engineering, New York University, Brooklyn, USA
  5. Department of Computer and Information Science, Brooklyn College of the City University of New York, Brooklyn, USA
