Sparse Suffix Tree Construction in Small Space

  • Conference paper
Automata, Languages, and Programming (ICALP 2013)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 7965)

Abstract

We consider the problem of constructing a sparse suffix tree (or suffix array) for b suffixes of a given text T of length n, using only O(b) words of space during construction. Attempts at breaking the naive bound of Ω(nb) time for this problem can be traced back to the origins of string indexing in 1968. The first results came only in 1996, and only for the case where the suffixes are evenly spaced in T. In this paper, there is no constraint on the locations of the suffixes.
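To make the problem statement concrete, the following is a minimal sketch of a direct baseline, not the algorithm of the paper. It sorts the b chosen suffix positions by comparing the suffixes character by character, so a single comparison can cost time linear in n. The function names `lcp` and `naive_sparse_suffix_array` are hypothetical and chosen only for this illustration.

```python
from functools import cmp_to_key


def lcp(text, i, j):
    """Length of the longest common prefix of text[i:] and text[j:]."""
    n = len(text)
    k = 0
    while i + k < n and j + k < n and text[i + k] == text[j + k]:
        k += 1
    return k


def naive_sparse_suffix_array(text, positions):
    """Sort the given suffix start positions lexicographically.

    Illustrative baseline only: besides the output list, it keeps O(1)
    extra state per comparison, but each comparison may scan O(n) characters.
    """
    n = len(text)

    def compare(i, j):
        if i == j:
            return 0
        k = lcp(text, i, j)
        if i + k == n:      # suffix i is a proper prefix of suffix j
            return -1
        if j + k == n:      # suffix j is a proper prefix of suffix i
            return 1
        return -1 if text[i + k] < text[j + k] else 1

    return sorted(positions, key=cmp_to_key(compare))


# Suffixes of "banana" starting at 0, 2, 4 are "banana", "nana", "na".
print(naive_sparse_suffix_array("banana", [0, 2, 4]))  # → [0, 4, 2]
```

Only the b requested positions are stored, which matches the O(b)-word space budget, but the running time is what the paper sets out to improve.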

We show that the sparse suffix tree can be constructed in \(O(n \log^2 b)\) time. To achieve this we develop a technique, which may be of independent interest, that allows us to efficiently answer b longest common prefix queries on suffixes of T, using only O(b) space. We expect that this technique will prove useful in many other applications in which space usage is a concern. Our first solution is Monte-Carlo and outputs the correct tree with high probability. We then give a Las-Vegas algorithm which also uses O(b) space and runs in the same time bounds with high probability when \(b = O(\sqrt{n})\). Furthermore, additional tradeoffs between the space usage and the construction time for the Monte-Carlo algorithm are given.
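The Monte-Carlo flavor of a longest common prefix query can be illustrated with a small sketch. This is an assumption-laden toy, not the construction from the paper: it uses Karp-Rabin style polynomial fingerprints and a binary search on the prefix length, and for simplicity it stores a fingerprint for every prefix of T, which takes O(n) words rather than the O(b) words the paper achieves. The modulus, the base selection, and all function names are hypothetical choices made for this example.

```python
import random

MOD = (1 << 61) - 1  # a Mersenne prime, picked here purely for illustration


def random_base():
    """Pick a random base; a collision then happens only with small probability."""
    return random.randrange(256, MOD)


def prefix_fingerprints(text, base):
    """Fingerprints of all prefixes of text, plus the powers of base (O(n) space)."""
    prefix = [0] * (len(text) + 1)
    power = [1] * (len(text) + 1)
    for idx, ch in enumerate(text):
        prefix[idx + 1] = (prefix[idx] * base + ord(ch)) % MOD
        power[idx + 1] = (power[idx] * base) % MOD
    return prefix, power


def substring_fingerprint(prefix, power, start, length):
    """Fingerprint of text[start:start + length]."""
    return (prefix[start + length] - prefix[start] * power[length]) % MOD


def fingerprint_lcp(text, prefix, power, i, j):
    """Binary search for the largest length whose fingerprints agree.

    Equal substrings always have equal fingerprints, so the answer is never
    too small; it can be too large only if a fingerprint collision occurs.
    """
    lo, hi = 0, len(text) - max(i, j)
    while lo < hi:
        mid = (lo + hi + 1) // 2
        same = (substring_fingerprint(prefix, power, i, mid)
                == substring_fingerprint(prefix, power, j, mid))
        if same:
            lo = mid
        else:
            hi = mid - 1
    return lo


text = "abracadabra"
prefix, power = prefix_fingerprints(text, random_base())
print(fingerprint_lcp(text, prefix, power, 0, 7))  # "abracadabra" vs "abra"
```

Each query costs O(log n) fingerprint comparisons after preprocessing. A Las-Vegas variant would additionally have to verify the answers, which is the harder part the abstract refers to.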

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Bille, P., Fischer, J., Gørtz, I.L., Kopelowitz, T., Sach, B., Vildhøj, H.W. (2013). Sparse Suffix Tree Construction in Small Space. In: Fomin, F.V., Freivalds, R., Kwiatkowska, M., Peleg, D. (eds) Automata, Languages, and Programming. ICALP 2013. Lecture Notes in Computer Science, vol 7965. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-39206-1_13

  • DOI: https://doi.org/10.1007/978-3-642-39206-1_13

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-39205-4

  • Online ISBN: 978-3-642-39206-1

  • eBook Packages: Computer Science (R0)
