Linear-Time Construction of Suffix Arrays
The time complexity of suffix tree construction has been shown to be equivalent to that of sorting: O(n) for a constant-size alphabet or an integer alphabet, and O(n log n) for a general alphabet. However, previous algorithms for constructing suffix arrays take O(n log n) time even for a constant-size alphabet.
In this paper we present a linear-time algorithm to construct suffix arrays for integer alphabets, which does not use suffix trees as intermediate data structures during the construction. Since the case of a constant-size alphabet is subsumed by that of an integer alphabet, our result implies that the time complexity of directly constructing suffix arrays matches that of constructing suffix trees.
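For readers unfamiliar with the data structure, the following is a minimal sketch of direct suffix array construction by prefix doubling. It is not the linear-time algorithm of this paper: it only illustrates what a suffix array is and how one can be built without a suffix tree. The function name `suffix_array` is an assumption made for illustration, and because the sketch uses a comparison sort in each round it runs in O(n log^2 n) time rather than O(n log n) or O(n).

```python
def suffix_array(text):
    """Return the start positions of all suffixes of `text` in sorted order.

    Illustrative prefix-doubling sketch (hypothetical helper, not the
    paper's linear-time method).
    """
    n = len(text)
    if n == 0:
        return []

    # rank[i] orders suffix i by its first k characters (k = 1 at the start).
    rank = [ord(c) for c in text]
    sa = list(range(n))
    k = 1
    while True:
        # Comparing (rank of first k chars, rank of next k chars) orders
        # the suffixes by their first 2k characters. A suffix shorter
        # than k + 1 characters gets -1 so that it sorts first.
        def key(i):
            return (rank[i], rank[i + k] if i + k < n else -1)

        sa.sort(key=key)

        # Re-rank: suffixes with equal keys share a rank.
        new_rank = [0] * n
        for j in range(1, n):
            differs = key(sa[j]) != key(sa[j - 1])
            new_rank[sa[j]] = new_rank[sa[j - 1]] + (1 if differs else 0)
        rank = new_rank

        # Stop once every suffix has a distinct rank.
        if rank[sa[-1]] == n - 1:
            break
        k *= 2
    return sa
```

For example, `suffix_array("banana")` returns `[5, 3, 1, 0, 4, 2]`, corresponding to the sorted suffixes a, ana, anana, banana, na, nana.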
Keywords: Equivalence Class, Tree Construction, Suffix Tree, Limit Stage, Couple Pair