Space Efficient Suffix Trees

Munro, Ian; Raman, Venkatesh; Rao, S. Srinivasa

doi:10.1007/978-3-540-49382-2_17

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1530)

Abstract

We first give a representation of a suffix tree that uses \(n \lg n + O(n)\) bits of space and supports searching for a pattern in a given text over a fixed-size alphabet in \(O(m)\) time, where \(n\) is the size of the text and \(m\) is the size of the pattern. The structure is quite simple and answers a question raised by Muthukrishnan [17]. Previous compact representations of suffix trees either had a larger lower-order term in the space bound and relied on an expected-case assumption [3], or required more time for searching [5]. Then, surprisingly, we show that we can do even better, by developing a structure that uses a suffix array (and so \(n \lceil \lg n \rceil\) bits) and an additional \(o(n)\) bits. String searching in this structure also takes \(O(m)\) time. Besides supporting string searching, we can also report the number of occurrences of the pattern within the same time bound, using no additional space. For this counting problem, our structures occupy much less space than many previously known structures. When the alphabet size \(k\) is not a constant, our structures can be easily extended, using standard tricks, either to structures that use the same space but take \(O(m \lg k)\) time for string searching, or to structures that use an additional \(O(n \lg k)\) bits but keep the \(O(m)\) search time.
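For readers unfamiliar with suffix arrays, the following is a minimal sketch of pattern searching and occurrence counting on a plain suffix array. It is not the structure described in the abstract: it uses the classical binary search, which costs \(O(m \lg n)\) time rather than \(O(m)\), and it builds the array by naive sorting. The function names `build_suffix_array`, `occurrence_range`, and `count_occurrences` are hypothetical and chosen only for illustration.

```python
def build_suffix_array(text: str) -> list:
    """Return the start positions of all suffixes of `text` in lexicographic order.

    Naive construction by sorting suffix strings; suitable only for small examples.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def occurrence_range(text: str, sa: list, pattern: str) -> tuple:
    """Return (start, end) so that sa[start:end] lists every occurrence of `pattern`.

    Classical binary search on the suffix array (assumption: this is the
    textbook O(m lg n) method, not the O(m) search from the paper).
    """
    m = len(pattern)

    # Lower bound: first suffix whose length-m prefix is >= pattern.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo

    # Upper bound: first suffix whose length-m prefix is > pattern.
    hi = len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid
    return start, lo


def count_occurrences(text: str, sa: list, pattern: str) -> int:
    """Number of occurrences is simply the width of the matching range."""
    start, end = occurrence_range(text, sa, pattern)
    return end - start


if __name__ == "__main__":
    text = "banana"
    sa = build_suffix_array(text)
    start, end = occurrence_range(text, sa, "ana")
    print(sorted(sa[start:end]), count_occurrences(text, sa, "ana"))
```

The sketch shows why counting needs no extra space once the matching range is known: all suffixes that begin with the pattern are contiguous in the suffix array, so the count is the difference of two indices.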

References

[1] Apostolico, A., Preparata, F.P.: Structural properties of the string statistics problem. Journal of Computer and System Sciences 31, 394–411 (1985)
[2] Cardenas, A.F.: Analysis and performance of inverted data base structures. Communications of the ACM 18(5), 253–263 (1975)
[3] Clark, D.R., Munro, J.I.: Efficient Suffix Trees on Secondary Storage. In: Proceedings of the 7th ACM-SIAM Symposium on Discrete Algorithms, pp. 383–391 (1996)
[4] Clift, B., Haussler, D., McConnel, R., Schneider, T.D., Stormo, G.D.: Sequence landscapes. Nucleic Acids Research 14(1), 141–158 (1986)
[5] Colussi, L., De Col, A.: A time and space efficient data structure for string searching on large texts. Information Processing Letters 58, 217–222 (1996)
[6] Fraser, C., Wendt, A., Myers, E.W.: Analysing and compressing assembly code. In: Proceedings of the SIGPLAN Symposium on Compiler Construction (1984)
[7] Gonnet, G.H., Baeza-Yates, R.A., Snider, T.: New indices for text: PAT trees and PAT arrays. In: Frakes, W.B., Baeza-Yates, R. (eds.) Information Retrieval: Data Structures and Algorithms, pp. 66–82. Prentice-Hall, Englewood Cliffs (1992)
[8] Jacobson, G.: Space-efficient Static Trees and Graphs. In: Proceedings of the IEEE Symposium on Foundations of Computer Science, pp. 549–554 (1989)
[9] Kärkkäinen, J., Ukkonen, E.: Sparse suffix trees. In: Cai, J.-Y., Wong, C.K. (eds.) COCOON 1996. LNCS, vol. 1090, pp. 219–230. Springer, Heidelberg (1996)
[10] Landau, G.M., Vishkin, U.: Introducing efficient parallelism into approximate string matching. In: Proc. 18th ACM Symposium on Theory of Computing, pp. 220–230 (1986)
[11] Manber, U., Myers, G.: Suffix Arrays: A New Method for On-line String Searches. SIAM Journal on Computing 22(5), 935–948 (1993)
[12] McCreight, M.E.: A space-economical suffix tree construction algorithm. Journal of the ACM 23, 262–272 (1976)
[13] Morrison, D.R.: PATRICIA: Practical Algorithm To Retrieve Information Coded In Alphanumeric. Journal of the ACM 15, 514–534 (1968)
[14] Munro, J.I., Benoit, D.: Succinct Representation of k-ary trees. Manuscript
[15] Munro, J.I.: Tables. In: Chandru, V., Vinay, V. (eds.) FSTTCS 1996. LNCS, vol. 1180, pp. 37–42. Springer, Heidelberg (1996)
[16] Munro, J.I., Raman, V.: Succinct representation of balanced parentheses, static trees and planar graphs. In: Proceedings of the IEEE Symposium on Foundations of Computer Science, pp. 118–126 (1997)
[17] Muthukrishnan, S.: Randomization in Stringology. In: Proceedings of the Preconference Workshop on Randomization, Kharagpur, India (December 1997)
[18] Rodeh, M., Pratt, V.R., Even, S.: Linear algorithm for data compression via string matching. Journal of the ACM 28(1), 16–24 (1981)
[19] Shang, H.: Trie methods for text and spatial data structures on secondary storage. PhD Thesis, McGill University (1995)
[20] Weiner, P.: Linear pattern matching algorithm. In: Proc. 14th IEEE Symposium on Switching and Automata Theory, pp. 1–11 (1973)

Author information

Authors and Affiliations

Department of Computer Science, University of Waterloo, Canada, N2L 3G1
Ian Munro
The Institute of Mathematical Sciences, Chennai, India, 600113
Venkatesh Raman & S. Srinivasa Rao

Editor information

Editors and Affiliations

The Institute of Mathematical Sciences, 600 113, Chennai, India
Vikraman Arvind
National Institute of Advanced Studies, Indian Institute of Science Campus, 560012, Bangalore, India
Sundar Ramanujam

About this paper

Cite this paper

Munro, I., Raman, V., Rao, S.S. (1998). Space Efficient Suffix Trees. In: Arvind, V., Ramanujam, S. (eds) Foundations of Software Technology and Theoretical Computer Science. FSTTCS 1998. Lecture Notes in Computer Science, vol 1530. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-49382-2_17

DOI: https://doi.org/10.1007/978-3-540-49382-2_17
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-65384-4
Online ISBN: 978-3-540-49382-2
eBook Packages: Springer Book Archive
