# Constructing the Suffix Tree of a Tree with a Large Alphabet

## Abstract

Constructing the suffix tree of a common suffix tree (CS-tree) generalizes the problem of constructing the suffix tree of a string. It has many applications, including minimizing the size of sequential transducers and tree pattern matching. The best known algorithm for this problem is Breslauer’s *O*(*n* log |Σ|) time algorithm, where *n* is the size of the CS-tree and |Σ| is the alphabet size; it requires *O*(*n* log *n*) time if |Σ| is large. We improve this bound by giving an *O*(*n* log log *n*) algorithm for integer alphabets. For trees called shallow *k*-ary trees, we give an optimal linear time algorithm. We also describe a new data structure, the Bsuffix tree, which enables efficient queries for patterns that are completely balanced *k*-ary trees in a *k*-ary tree or forest, and we propose an optimal *O*(*n*) algorithm for constructing the Bsuffix tree for integer alphabets.
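To make the object in the first sentence concrete, the following is a minimal sketch, not any of the algorithms referred to in the abstract. It assumes the usual reading of a CS-tree: every node stands for the string spelled by the edge labels on the path from that node up to the root, and the suffix tree of the CS-tree is the compacted trie of all those strings. The sketch builds the uncompacted trie naively, which can take quadratic time on deep trees. The names `naive_suffix_trie`, `parent` and `label` are hypothetical and chosen only for illustration.

```python
from typing import List, Optional


def naive_suffix_trie(parent: List[Optional[int]],
                      label: List[Optional[str]]) -> dict:
    """Build an uncompacted trie of all node-to-root strings of a CS-tree.

    parent[v] is the parent of node v (None for the root) and label[v] is
    the symbol on the edge from v to parent[v]. This is an illustrative
    assumption about the input format, not the paper's representation.
    """
    root: dict = {}
    for v in range(len(parent)):
        node = root
        u = v
        # Walk from v up to the root, inserting one symbol per edge.
        while parent[u] is not None:
            node = node.setdefault(label[u], {})
            u = parent[u]
    return root


# A small CS-tree: node 0 is the root; nodes 2 and 3 share the suffix "a".
parent = [None, 0, 1, 1, 0]
label = [None, "a", "b", "c", "b"]
trie = naive_suffix_trie(parent, label)
# The strings inserted are "", "a", "ba", "ca" and "b".
```

Because "b" is a prefix of "ba", the two strings share a branch in `trie`. Compacting unary paths in this trie would give the suffix tree of the CS-tree.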

## Keywords

Binary Number, Edge Label, Suffix Tree, Large Alphabet, Balance Binary Tree

## References

1. A. Apostolico and Z. Galil, eds., “Pattern Matching Algorithms”, *Oxford University Press, New York*, 1997.
2. D. Breslauer, “The Suffix Tree of a Tree and Minimizing Sequential Transducers,” *J. Theoretical Computer Science*, Vol. 191, 1998, pp. 131–144.
3. M. T. Chen and J. Seiferas, “Efficient and Elegant Subword Tree Construction,” A. Apostolico and Z. Galil, eds., *Combinatorial Algorithms on Words, Chapter 12*, NATO ASI Series F: Computer and System Sciences, 1985, pp. 97–107.
4. R. Cole, R. Hariharan and P. Indyk, “Tree Pattern Matching and Subset Matching in Deterministic O(n log³ n)-time,” *Proc. 4th Symposium on Discrete Mathematics (SODA’ 99)*, 1999, pp. 245–254.
5. M. Farach, “Optimal Suffix Tree Construction with Large Alphabets,” *Proc. 38th IEEE Symp. Foundations of Computer Science (FOCS’ 97)*, 1997, pp. 137–143.
6. M. Farach and S. Muthukrishnan, “Optimal Logarithmic Time Randomized Suffix Tree Construction,” *Proc. 23rd International Colloquium on Automata Languages and Programming (ICALP’ 96)*, 1996, pp. 550–561.
7. R. Giancarlo, “The Suffix Tree of a Square Matrix, with Applications,” *Proc. 4th Symposium on Discrete Mathematics (SODA’ 93)*, 1993, pp. 402–411.
8. D. Gusfield, “Algorithms on Strings, Trees, and Sequences: Computer Science and Computational Biology,” *Cambridge University Press*, 1997.
9. D. Harel and R. E. Tarjan, “Fast Algorithms for Finding Nearest Common Ancestors,” *SIAM J. Computing*, Vol. 13, 1984, pp. 338–355.
10. S. R. Kosaraju, “Efficient Tree Pattern Matching,” *Proc. 30th IEEE Symp. Foundations of Computer Science (FOCS’ 89)*, 1989, pp. 178–183.
11. E. M. McCreight, “A Space-Economical Suffix Tree Construction Algorithm,” *J. ACM*, Vol. 23, 1976, pp. 262–272.
12. E. Ukkonen, “On-Line Construction of Suffix-Trees,” *Algorithmica*, Vol. 14, 1995, pp. 249–60.
13. P. Weiner, “Linear Pattern Matching Algorithms,” *Proc. 14th Symposium on Switching and Automata Theory*, 1973, pp. 1–11.