Abstract
Given a text \(\mathsf {T}\) of \(n\) characters, we consider the non-overlapping indexing problem, defined as follows: pre-process \(\mathsf {T}\) into a data structure such that, whenever a pattern \(P\) is given as a query, we can report a maximal set of non-overlapping occurrences of \(P\) in \(\mathsf {T}\). The best known solution for this problem takes linear space: a suffix tree of \(\mathsf {T}\) is augmented with \(O(n)\)-word data structures, and a query \(P\) is answered in optimal \(O(|P|+\mathsf {nocc})\) time, where \(\mathsf {nocc}\) is the output size [Cohen and Porat, ISAAC 2009]. We present the following new result: let \(\mathsf {CSA}\) (not necessarily a compressed suffix array) be an index of \(\mathsf {T}\) that can compute (i) the suffix range of \(P\) in \(\mathsf {search}(P)\) time, and (ii) a suffix array or an inverse suffix array value in \(\mathsf {t_{SA}}\) time; then, using \(\mathsf {CSA}\) alone, we can answer a query \(P\) in \(O(\mathsf {search}(P)+ \mathsf {nocc}\cdot \mathsf {t_{SA}})\) time. Additionally, we present an improved result for a generalized version of this problem called range non-overlapping indexing.
This research is funded in part by National Science Foundation (NSF) Grant CCF 1218904.
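To make the problem statement concrete, the following is a minimal sketch of what a query is expected to return, not the index described in the abstract. It scans the text directly instead of using a suffix tree or \(\mathsf {CSA}\), so it takes time proportional to the text rather than to the output. The function names `find_occurrences` and `non_overlapping` are hypothetical and chosen only for this illustration. The sketch relies on a standard fact: since all occurrences have the same length \(|P|\), scanning them from left to right and keeping each one that starts at or after the end of the last kept occurrence yields a largest possible set of non-overlapping occurrences.

```python
def find_occurrences(text: str, pattern: str) -> list[int]:
    """Return all start positions of pattern in text (overlaps allowed), in increasing order."""
    positions = []
    start = text.find(pattern)
    while start != -1:
        positions.append(start)
        # Advance by one character so that overlapping occurrences are also found.
        start = text.find(pattern, start + 1)
    return positions


def non_overlapping(text: str, pattern: str) -> list[int]:
    """Greedily pick a largest set of pairwise non-overlapping occurrences.

    Naive baseline for illustration only: it rescans the text on every query,
    whereas an index would answer from pre-processed data.
    """
    if not pattern:
        return []  # assumption: an empty pattern has no meaningful occurrences
    chosen = []
    next_free = 0  # first text position not covered by a chosen occurrence
    for pos in find_occurrences(text, pattern):
        if pos >= next_free:
            chosen.append(pos)
            next_free = pos + len(pattern)
    return chosen


if __name__ == "__main__":
    print(non_overlapping("abababa", "aba"))
```

For the text `abababa` and the pattern `aba`, the occurrences start at positions 0, 2 and 4; the occurrence at 2 overlaps the one at 0, so the sketch keeps positions 0 and 4.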
References
Abouelhoda, M.I., Kurtz, S., Ohlebusch, E.: Replacing suffix trees with enhanced suffix arrays. J. Discret. Algorithms 2(1), 53–86 (2004)
Alstrup, S., Brodal, G.S., Rauhe, T.: New data structures for orthogonal range searching. In: 41st Annual Symposium on Foundations of Computer Science, FOCS 2000, 12–14 November 2000, Redondo Beach, California, USA, pp. 198–207 (2000)
Alstrup, S., Brodal, G.S., Rauhe, T.: Optimal static range reporting in one dimension. In: Proceedings of the 33rd Annual ACM Symposium on Theory of Computing, 6–8 July 2001, Heraklion, Crete, Greece, pp. 476–482 (2001)
Belazzougui, D., Navarro, G.: Alphabet-independent compressed text indexing. ACM Trans. Algorithms 10(4), 23 (2014)
Boyer, R.S., Moore, J.S.: A fast string searching algorithm. Commun. ACM 20(10), 762–772 (1977)
Cohen, H., Porat, E.: Range non-overlapping indexing. In: Dong, Y., Du, D.-Z., Ibarra, O. (eds.) ISAAC 2009. LNCS, vol. 5878, pp. 1044–1053. Springer, Heidelberg (2009)
Crochemore, M., Iliopoulos, C.S., Kubica, M., Rahman, M.S., Walen, T.: Improved algorithms for the range next value problem and applications. In: STACS 2008, Proceedings of the 25th Annual Symposium on Theoretical Aspects of Computer Science, Bordeaux, France, 21–23 February 2008, pp. 205–216 (2008)
Ferragina, P., Manzini, G.: Indexing compressed text. J. ACM 52(4), 552–581 (2005)
Grossi, R., Vitter, J.S.: Compressed suffix arrays and suffix trees with applications to text indexing and string matching (extended abstract). In: Proceedings of the Thirty-Second Annual ACM Symposium on Theory of Computing, 21–23 May 2000, Portland, OR, USA, pp. 397–406 (2000)
Gusfield, D.: Algorithms on Strings, Trees, and Sequences: Computer Science and Computational Biology. Cambridge University Press, New York (1997)
Hon, W., Shah, R., Thankachan, S.V., Vitter, J.S.: On position restricted substring searching in succinct space. J. Discret. Algorithms 17, 109–114 (2012)
Karp, R.M., Rabin, M.O.: Efficient randomized pattern-matching algorithms. IBM J. Res. Dev. 31(2), 249–260 (1987)
Keller, O., Kopelowitz, T., Lewenstein, M.: Range non-overlapping indexing and successive list indexing. In: Dehne, F., Sack, J.-R., Zeh, N. (eds.) WADS 2007. LNCS, vol. 4619, pp. 625–636. Springer, Heidelberg (2007)
Knuth, D.E., Morris Jr., J.H., Pratt, V.R.: Fast pattern matching in strings. SIAM J. Comput. 6(2), 323–350 (1977)
Mäkinen, V., Navarro, G.: Position-restricted substring searching. In: Correa, J.R., Hevia, A., Kiwi, M. (eds.) LATIN 2006. LNCS, vol. 3887, pp. 703–714. Springer, Heidelberg (2006)
Manber, U., Myers, E.W.: Suffix arrays: a new method for on-line string searches. SIAM J. Comput. 22, 935–948 (1993)
Navarro, G., Mäkinen, V.: Compressed full-text indexes. ACM Comput. Surv. 39(1) (2007)
Nekrich, Y., Navarro, G.: Sorted range reporting. In: Fomin, F.V., Kaski, P. (eds.) SWAT 2012. LNCS, vol. 7357, pp. 271–282. Springer, Heidelberg (2012)
Ukkonen, E.: On-line construction of suffix trees. Algorithmica 14(3), 249–260 (1995)
Weiner, P.: Linear pattern matching algorithms. In: 14th Annual Symposium on Switching and Automata Theory, Iowa City, Iowa, USA, 15–17 October 1973, pp. 1–11 (1973)
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Ganguly, A., Shah, R., Thankachan, S.V. (2015). Succinct Non-overlapping Indexing. In: Cicalese, F., Porat, E., Vaccaro, U. (eds) Combinatorial Pattern Matching. CPM 2015. Lecture Notes in Computer Science, vol 9133. Springer, Cham. https://doi.org/10.1007/978-3-319-19929-0_16
DOI: https://doi.org/10.1007/978-3-319-19929-0_16
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-19928-3
Online ISBN: 978-3-319-19929-0
eBook Packages: Computer Science, Computer Science (R0)