Dynamic Fully-Compressed Suffix Trees

  • Luís M. S. Russo
  • Gonzalo Navarro
  • Arlindo L. Oliveira
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5029)

Abstract

Suffix trees are by far the most important data structure in stringology, with myriads of applications in fields like bioinformatics, data compression and information retrieval. Classical representations of suffix trees require O(n log n) bits of space, for a string of size n. This is considerably more than the n log₂ σ bits needed for the string itself, where σ is the alphabet size. The size of suffix trees has been a barrier to their wider adoption in practice. A recent so-called fully-compressed suffix tree (FCST) requires asymptotically only the space of the text entropy. FCSTs, however, have the disadvantage of being static, not supporting updates to the text. In this paper we show how to support dynamic FCSTs within the same optimal space of the static version and executing all the operations in polylogarithmic time. In particular, we are able to build the suffix tree within optimal space.
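
To make the gap between the two space bounds concrete, the following is a minimal arithmetic sketch, not part of the paper. It assumes a hypothetical genome-scale DNA text (n = 3 × 10⁹, σ = 4) and counts a single array of n pointers of ⌈log₂ n⌉ bits each. The O(n log n) bound hides a constant factor, and classical suffix trees store several such arrays, so the pointer figure should be read as a floor rather than a measurement. The function names are illustrative only.

```python
import math


def text_bits(n: int, sigma: int) -> float:
    """Bits for a plain string of length n over an alphabet of size sigma."""
    return n * math.log2(sigma)


def pointer_array_bits(n: int) -> int:
    """Bits for ONE array of n pointers, each ceil(log2 n) bits wide.

    Assumption: constant factor 1. Pointer-based suffix trees need
    several such arrays, so this underestimates their real footprint.
    """
    bits_per_pointer = (n - 1).bit_length()  # equals ceil(log2 n) for n >= 2
    return n * bits_per_pointer


# Hypothetical inputs: a DNA-like text of three billion symbols.
n = 3_000_000_000
sigma = 4

plain = text_bits(n, sigma)
pointers = pointer_array_bits(n)
ratio = pointers / plain

print(f"text:     {plain / 8 / 1e9:.2f} GB")
print(f"pointers: {pointers / 8 / 1e9:.2f} GB")
print(f"ratio:    {ratio:.0f}x")
```

Under these assumptions a single pointer array is already 16 times the size of the text, which illustrates why a representation bounded by the text entropy matters in practice.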

Keywords

Time Complexity, Binary Search, Suffix Tree, Optimal Space, Dynamic Scenario
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Luís M. S. Russo (1, 3)
  • Gonzalo Navarro (2)
  • Arlindo L. Oliveira (1)
  1. INESC-ID / IST, Lisboa, Portugal
  2. Dept. of Computer Science, University of Chile
  3. Dept. of Computer Science, University of Lisbon, Portugal
