Space Efficient Suffix Trees

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1530)

Abstract

We first give a representation of a suffix tree that uses \(n \lg n + O(n)\) bits of space and supports searching for a pattern in a given text (over a fixed-size alphabet) in \(O(m)\) time, where \(n\) is the size of the text and \(m\) is the size of the pattern. The structure is quite simple and answers a question raised by Muthukrishnan in [17]. Previous compact representations of suffix trees had a higher lower-order term in space and relied on some expected-case assumption [3], or required more time for searching [5]. Then, surprisingly, we show that we can do even better by developing a structure that uses a suffix array (and so \(n \lceil \lg n \rceil\) bits) and an additional \(o(n)\) bits. String searching in this structure also takes \(O(m)\) time. Besides supporting string searching, we can also report the number of occurrences of the pattern in the same time using no additional space. In this case the space occupied by our structures is much smaller than that of many previously known structures for this task. When the alphabet size \(k\) is not a constant, our structures can be easily extended, using standard tricks, to ones that use the same space but take \(O(m \lg k)\) time for string searching, or to ones that use an additional \(O(n \lg k)\) bits but keep the same \(O(m)\) time for searching.
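To make the objects in the abstract concrete, the following is a minimal Python sketch of a plain suffix array with occurrence counting by binary search. It is a baseline illustration only, not the structure of the paper: the search here costs \(O(m \lg n)\) rather than \(O(m)\), the construction is deliberately naive, and the names `build_suffix_array`, `count_occurrences`, and `suffix_array_bits` are hypothetical, chosen for this example.

```python
def build_suffix_array(text: str) -> list[int]:
    """Return the start positions of all suffixes of `text`, sorted lexicographically.

    Naive construction (sorts full suffixes); chosen for brevity, not speed.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def count_occurrences(text: str, sa: list[int], pattern: str) -> int:
    """Count occurrences of `pattern` in `text` with two binary searches over `sa`.

    Each comparison looks at no more than m = len(pattern) characters, so the
    cost is O(m lg n). This is the plain suffix-array baseline, NOT the O(m)
    search of the structures described in the abstract.
    """
    m = len(pattern)

    def first_index(strict: bool) -> int:
        # strict=False: first suffix whose m-character prefix is >= pattern.
        # strict=True:  first suffix whose m-character prefix is >  pattern.
        lo, hi = 0, len(sa)
        while lo < hi:
            mid = (lo + hi) // 2
            prefix = text[sa[mid]:sa[mid] + m]
            if prefix < pattern or (strict and prefix == pattern):
                lo = mid + 1
            else:
                hi = mid
        return lo

    # All suffixes that start with `pattern` form one contiguous block of `sa`.
    return first_index(strict=True) - first_index(strict=False)


def suffix_array_bits(n: int) -> int:
    """Bits used by a suffix array stored as n entries of ceil(lg n) bits each."""
    return n * (n - 1).bit_length()  # (n - 1).bit_length() == ceil(lg n) for n >= 1


text = "mississippi"
sa = build_suffix_array(text)
print(sa)                                   # sorted suffix start positions
print(count_occurrences(text, sa, "ssi"))   # number of occurrences of "ssi"
print(suffix_array_bits(len(text)))         # n * ceil(lg n) bits for n = 11
```

Because the matching suffixes occupy a contiguous range of the array, the count falls out of the two range boundaries without enumerating the occurrences, which is the same observation behind reporting the number of occurrences at no extra space cost.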

References

  1. Apostolico, A., Preparata, F.P.: Structural properties of the string statistics problem. Journal of Computer and System Sciences 31, 394–411 (1985)
  2. Cardenas, A.F.: Analysis and performance of inverted data base structures. Communications of the ACM 18(5), 253–263 (1975)
  3. Clark, D.R., Munro, J.I.: Efficient Suffix Trees on Secondary Storage. In: Proceedings of the 7th ACM-SIAM Symposium on Discrete Algorithms, pp. 383–391 (1996)
  4. Clift, B., Haussler, D., McConnel, R., Schneider, T.D., Stormo, G.D.: Sequence landscapes. Nucleic Acids Research 14(1), 141–158 (1986)
  5. Colussi, L., De Col, A.: A time and space efficient data structure for string searching on large texts. Information Processing Letters 58, 217–222 (1996)
  6. Fraser, C., Wendt, A., Myers, E.W.: Analysing and compressing assembly code. In: Proceedings of the SIGPLAN Symposium on Compiler Construction (1984)
  7. Gonnet, G.H., Baeza-Yates, R.A., Snider, T.: New indices for text: PAT trees and PAT arrays. In: Frakes, W.B., Baeza-Yates, R. (eds.) Information Retrieval: Data Structures and Algorithms, pp. 66–82. Prentice-Hall, Englewood Cliffs (1992)
  8. Jacobson, G.: Space-efficient Static Trees and Graphs. In: Proceedings of the IEEE Symposium on Foundations of Computer Science, pp. 549–554 (1989)
  9. Kärkkäinen, J., Ukkonen, E.: Sparse suffix trees. In: Cai, J.-Y., Wong, C.K. (eds.) COCOON 1996. LNCS, vol. 1090, pp. 219–230. Springer, Heidelberg (1996)
  10. Landau, G.M., Vishkin, U.: Introducing efficient parallelism into approximate string matching. In: Proc. 18th ACM Symposium on Theory of Computing, pp. 220–230 (1986)
  11. Manber, U., Myers, G.: Suffix Arrays: A New Method for On-line String Searches. SIAM Journal on Computing 22(5), 935–948 (1993)
  12. McCreight, M.E.: A space-economical suffix tree construction algorithm. Journal of the ACM 23, 262–272 (1976)
  13. Morrison, D.R.: PATRICIA: Practical Algorithm To Retrieve Information Coded In Alphanumeric. Journal of the ACM 15, 514–534 (1968)
  14. Munro, J.I., Benoit, D.: Succinct Representation of k-ary trees. Manuscript
  15. Munro, J.I.: Tables. In: Chandru, V., Vinay, V. (eds.) FSTTCS 1996. LNCS, vol. 1180, pp. 37–42. Springer, Heidelberg (1996)
  16. Munro, J.I., Raman, V.: Succinct representation of balanced parentheses, static trees and planar graphs. In: Proceedings of the IEEE Symposium on Foundations of Computer Science, pp. 118–126 (1997)
  17. Muthukrishnan, S.: Randomization in Stringology. In: Proceedings of the Preconference Workshop on Randomization, Kharagpur, India (December 1997)
  18. Rodeh, M., Pratt, V.R., Even, S.: Linear algorithm for data compression via string matching. Journal of the ACM 28(1), 16–24 (1981)
  19. Shang, H.: Trie methods for text and spatial data structures on secondary storage. PhD Thesis, McGill University (1995)
  20. Weiner, P.: Linear pattern matching algorithm. In: Proc. 14th IEEE Symposium on Switching and Automata Theory, pp. 1–11 (1973)

Copyright information

© 1998 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Munro, I., Raman, V., Rao, S.S. (1998). Space Efficient Suffix Trees. In: Arvind, V., Ramanujam, S. (eds) Foundations of Software Technology and Theoretical Computer Science. FSTTCS 1998. Lecture Notes in Computer Science, vol 1530. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-49382-2_17

  • DOI: https://doi.org/10.1007/978-3-540-49382-2_17

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-65384-4

  • Online ISBN: 978-3-540-49382-2

  • eBook Packages: Springer Book Archive
