Advertisement

Abstract

In Indexing with Gaps one seeks to index a text to allow pattern queries that allow gaps within the pattern query. Formally a gapped-pattern over alphabet Σ is a pattern of the form p = p 1 g 1 p 2 g 2 ⋯ g p ℓ + 1, where ∀ i, p i  ∈ Σ* and each g i is a gap length ∈ N. Often one considers these patterns with some bound constraints, for example, all gaps are bounded by a gap-bound G.

Near-optimal solutions have, lately, been proposed for the case of one gap only with a predetermined size. More specifically, an indexing solution for patterns of the form p 1·g·p 2, where g is known apriori. In this case the solutions mentioned are preprocessed in O(n log ε n) time and O(n) space, where the pattern queries are answered in O(|p 1| + |p 2|), for constant sized alphabets. For the more general case when there is a bound G these results can be easily adapted with a multiplicative factor of O(G) for the preprocessing, i.e. O(n log ε nG) preprocessing time and O(nG) preprocessing space. Alas, these solutions do not lend to more than one gap.

In this paper we propose a solution for k gaps one with preprocessing time O(nG 2k log k n loglogn) and space of O(nG 2k log k n) and query time O(m + 2 k loglogn), where m = ∑  i = 1 |p i |.

Keywords

Pattern Match Query Time String Match Pattern Query Lower Common Ancestor 
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. 1.
    Amir, A., Keselman, D., Landau, G., Lewenstein, N., Lewenstein, M., Rodeh, M.: Text indexing and dictionary matching with one error. J. of Algorithms 37(2), 309–325 (2000)MathSciNetCrossRefzbMATHGoogle Scholar
  2. 2.
    Amir, A., Landau, G., Lewenstein, M., Sokol, D.: Dynamic pattern, static text matching. ACM Transactions on Algorithms 3(2) (2007)Google Scholar
  3. 3.
    Bille, P., Gørtz, I.L.: Substring range reporting. In: Giancarlo, R., Manzini, G. (eds.) CPM 2011. LNCS, vol. 6661, pp. 299–308. Springer, Heidelberg (to apppear, 2011)CrossRefGoogle Scholar
  4. 4.
    Bille, P., Li Gørtz, I., Vildhøj, H.W., Wind, D.K.: String matching with variable length gaps. In: Chavez, E., Lonardi, S. (eds.) SPIRE 2010. LNCS, vol. 6393, pp. 385–394. Springer, Heidelberg (2010)CrossRefGoogle Scholar
  5. 5.
    Clifford, P., Clifford, R.: Self-normalised distance with don’t cares. In: Ma, B., Zhang, K. (eds.) CPM 2007. LNCS, vol. 4580, pp. 63–70. Springer, Heidelberg (2007)CrossRefGoogle Scholar
  6. 6.
    Cole, R., Gottlieb, L., Lewenstein, M.: Dictionary matching and indexing with errors and don’t cares. In: Proceedings of the Symposium On Theory of Computing (STOC), pp. 91–100 (2004)Google Scholar
  7. 7.
    Cole, R., Hariharan, R.: Verifying candidate matches in sparse and wildcard matching. In: Proceedings of the Symposium On Theory of Computing (STOC), pp. 592–601 (2002)Google Scholar
  8. 8.
    Crochemore, M., Iliopoulos, C., Makris, C., Rytter, W., Tsakalidis, A., Tsichlas, K.: Approximate string matching with gaps. Nordic J. of Computing 9(1), 54–65 (2002)MathSciNetzbMATHGoogle Scholar
  9. 9.
    Ferragina, P., Muthukrishnan, S., de Berg, M.: Multi-method dispatching: A geometric approach with applications to string matching problems. In: Proceedings of the Symposium on Theory of Computing (STOC), pp. 483–491 (1999)Google Scholar
  10. 10.
    Fischer, M., Paterson, M.: String matching and other products. In: Karp, R.M. (ed.) Complexity of Computation, SIAM-AMS Proceedings, vol. 7, pp. 113–125 (1974)Google Scholar
  11. 11.
    Harel, D., Tarjan, R.: Fast algorithms for finding nearest common ancestors. SIAM Journal on Computing 13(2), 338–355 (1984)MathSciNetCrossRefzbMATHGoogle Scholar
  12. 12.
    Iliopoulos, C., Rahman, M.: Indexing factors with gaps. Algorithmica 55(1), 60–70 (2008)MathSciNetCrossRefzbMATHGoogle Scholar
  13. 13.
    Indyk, P.: Faster algorithms for string matching problems: Matching the convolution bound. In: Proceedings of the Symposium on Foundations of Computer Science (FOCS), pp. 166–173 (1998)Google Scholar
  14. 14.
    Kalai, A.: Efficient pattern matching with don’t cares. In: Proceedings of the Symposium on Discrete Algorithms (SODA), pp. 655–656 (2002)Google Scholar
  15. 15.
    Lam, T.-W., Sung, W.-K., Tam, S.-L., Yiu, S.-M.: Space efficient indexes for string matching with don’t cares. In: Tokuyama, T. (ed.) ISAAC 2007. LNCS, vol. 4835, pp. 846–857. Springer, Heidelberg (2007)CrossRefGoogle Scholar
  16. 16.
    Farach-Colton, S.M.M., Ferragina, P.: On the sorting-complexity of suffix tree construction. J. ACM 47(1), 987–1011 (2000)MathSciNetCrossRefzbMATHGoogle Scholar
  17. 17.
    McCreight, E.M.: A space-economical suffix tree construction algorithm. J. ACM 23(2), 262–272 (1976)MathSciNetCrossRefzbMATHGoogle Scholar
  18. 18.
    Peterlongo, M.S.P., Allali, J.: Indexing gapped-factors using a tree. Int. J. Found. Comput. Sci. 19(1), 71–87 (2008)MathSciNetCrossRefzbMATHGoogle Scholar
  19. 19.
    Schieber, B., Vishkin, U.: On finding lowest common ancestors: simplifications and parallelization. SIAM Journal on Computing 17(6), 1253–1262 (1988)MathSciNetCrossRefzbMATHGoogle Scholar
  20. 20.
    Ukkonen, E.: On-line construction of suffix trees. Algorithmica 14(3), 249–260 (1995)MathSciNetCrossRefzbMATHGoogle Scholar
  21. 21.
    Weiner, P.: Linear pattern matching algorithms. In: 14th Annual Symposium on Switching and Automata Theory, pp. 1–11. IEEE, Los Alamitos (1973)CrossRefGoogle Scholar

Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Moshe Lewenstein
    • 1
  1. 1.Department of Computer ScienceBar-Ilan UniversityRamat-GanIsrael

Personalised recommendations