Abstract
We investigate the problem of maintaining the arc labels in the suffix tree data structure [15] when it undergoes string insertions and deletions. In the current literature, this problem is solved either by a simple accounting strategy that yields amortized bounds [10, 18] or by periodically reconstructing the suffix tree to obtain worst-case bounds (according to the global rebuilding technique in [20]). The former approach is simple and space-efficient, but its bounds for a single update are only amortized; the latter is space-consuming in practice because it needs to keep two extra suffix tree copies. In this paper, we obtain a surprisingly simple real-time algorithm that achieves worst-case bounds and requires only a small amount of additional space (i.e., a bi-directional pointer per suffix tree arc). We analyze the problem by introducing a combinatorial coloring problem on the suffix tree arcs.
Supported in part by MURST of Italy.
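To make the label-maintenance problem in the abstract concrete, the following is a minimal sketch, not the algorithm of the paper. It assumes the common representation in which an arc label is a pointer (string identifier, start, end) into one of the indexed strings, so that deleting a string can leave surviving arcs pointing into text that no longer exists. The names `ArcLabel`, `label_text`, and `relabel_naive` are hypothetical, and the relabeling is done by a naive scan of the remaining strings, which gives none of the worst-case guarantees discussed above.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class ArcLabel:
    """Hypothetical arc label: a pointer into one of the indexed strings."""
    string_id: int
    start: int  # inclusive
    end: int    # exclusive


def label_text(label: ArcLabel, strings: Dict[int, str]) -> str:
    """Return the substring that the label points to."""
    return strings[label.string_id][label.start:label.end]


def relabel_naive(label: ArcLabel, strings: Dict[int, str],
                  deleted_id: int) -> Optional[ArcLabel]:
    """Redirect a label away from the string that is about to be deleted.

    Naive illustration only: scan the remaining strings for an occurrence
    of the same substring. Returns None when no remaining string contains
    it, i.e. the arc would disappear together with the deleted string.
    """
    if label.string_id != deleted_id:
        return label  # already points into a surviving string
    text = label_text(label, strings)
    for sid, s in strings.items():
        if sid == deleted_id:
            continue
        pos = s.find(text)
        if pos != -1:
            return ArcLabel(sid, pos, pos + len(text))
    return None


# Two indexed strings with distinct terminators (an assumption of this sketch).
strings = {0: "banana$", 1: "bandana#"}
shared = ArcLabel(0, 1, 3)   # "an", also a substring of string 1
private = ArcLabel(0, 6, 7)  # "$", occurs only in string 0

moved = relabel_naive(shared, strings, deleted_id=0)
gone = relabel_naive(private, strings, deleted_id=0)
```

Here `moved` points into string 1 and still spells "an", while `gone` is `None`. The scan costs time proportional to the total length of the remaining strings for every label, which is exactly the kind of cost that a per-arc pointer scheme with worst-case bounds is meant to avoid.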
References
Aho, A. V., and Corasick, M. J. Efficient string matching: an aid to bibliographic search. Communications of the ACM 18 (1975), 333–340.
Amir, A., and Farach, M. Adaptive dictionary matching. In IEEE Symposium on Foundations of Computer Science (1991), pp. 760–766.
Amir, A., Farach, M., Idury, R. M., La Poutré, H., and Schäffer, A. A. Improved dictionary matching. Information and Computation 119 (1995), 258–282.
Apostolico, A., and Preparata, F. Optimal off-line detection of repetitions in a string. Theoretical Computer Science 22 (1983), 297–315.
Baker, B. S. A theory of parameterized pattern matching: Algorithms and applications. In ACM Symposium on Theory of Computing (1993), pp. 71–80.
Breslauer, D. The suffix tree of a tree and minimizing sequential transducers. In Combinatorial Pattern Matching (1996).
Chang, W. I., and Lawler, E. L. Sublinear approximate string matching and biological applications. Algorithmica 12 (1994), 327–344.
Cleary, J. G., Teahan, W. J., and Witten, I. H. Unbounded length contexts for PPM. In IEEE Data Compression Conference (1995), pp. 52–61.
Ferragina, P., and Grossi, R. Optimal on-line search and sublinear time update in string matching. In IEEE Symposium on Foundations of Computer Science (1995), pp. 604–612. Also SIAM Journal on Computing (to appear).
Fiala, E. R., and Green, D. H. Data compression with finite window. Communications of the ACM 32, 4 (1989), 490–505.
Fox, A. E., et al. (eds.) Special issue on “Digital Libraries”. Communications of the ACM 38, 4 (1995).
Frenkel, K. A. The human genome project and informatics. Communications of the ACM 34 (1991), 41–51.
Giancarlo, R. A generalization of the suffix tree to square matrices, with applications. SIAM Journal on Computing 24 (1995), 520–562.
Gu, M., Farach, M., and Beigel, R. An efficient algorithm for dynamic text indexing. In ACM-SIAM Symposium on Discrete Algorithms (1994), pp. 697–704.
Gusfield, D., Landau, G. M., and Schieber, B. An efficient algorithm for all pairs suffix-prefix problem. Information Processing Letters 41 (1992), 181–185.
Kosaraju, S. R. Efficient tree pattern matching. In IEEE Symposium on Foundations of Computer Science (1989), pp. 178–183.
Landau, G. M., and Vishkin, U. Fast parallel and serial approximate string matching. Journal of Algorithms 10 (1989), 157–169.
Larsson, N. J. Extended application of suffix trees to data compression. In IEEE Data Compression Conference (1996).
McCreight, E. M. A space-economical suffix tree construction algorithm. Journal of the ACM 23, 2 (1976), 262–272.
Overmars, M. H. The Design of Dynamic Data Structures. Springer-Verlag Lecture Notes in Computer Science 156, 1983.
Sahinalp, S. C., and Vishkin, U. Efficient approximate and dynamic matching of patterns using a labeling paradigm. In IEEE Symposium on Foundations of Computer Science (1996).
Storer, J., and Szymanski, T. Data compression via textual substitution. Journal of the ACM 29, 4 (1982), 928–951.
Weiner, P. Linear pattern matching algorithm. In IEEE Symp. on Switching and Automata Theory (1973), pp. 1–11.
Ziv, J., and Lempel, A. A universal algorithm for sequential data compression. IEEE Trans. Info. Theory 23, 3 (1977), 337–343.
Ziv, J., and Lempel, A. Compression of individual sequences via variable-rate coding. IEEE Trans. Info. Theory 24, 5 (1978), 530–536.
Copyright information
© 1997 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ferragina, P., Grossi, R., Montangero, M. (1997). A note on updating suffix tree labels. In: Bongiovanni, G., Bovet, D.P., Di Battista, G. (eds) Algorithms and Complexity. CIAC 1997. Lecture Notes in Computer Science, vol 1203. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-62592-5_71
DOI: https://doi.org/10.1007/3-540-62592-5_71
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-62592-6
Online ISBN: 978-3-540-68323-0
eBook Packages: Springer Book Archive