Space-Efficient Construction of the Burrows-Wheeler Transform

  • Conference paper
String Processing and Information Retrieval (SPIRE 2013)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 8214)

Abstract

The Burrows-Wheeler transform (BWT), originally invented for data compression, is nowadays also the core of many self-indexes, which can be used to solve many problems in bioinformatics. However, the memory requirement during the construction of the BWT is often the bottleneck in applications in the bioinformatics domain.

In this paper, we present a linear-time semi-external algorithm whose memory requirement is only about one byte per input symbol. Our experiments show that this algorithm provides a new time-memory trade-off between external and in-memory construction algorithms.
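For orientation, the textbook in-memory construction that the abstract contrasts against can be sketched as follows. This is not the semi-external algorithm of the paper, only a naive illustration; the function name `bwt_via_suffix_array` and the `$` sentinel are assumptions made for this example. The sketch builds a full suffix array, which stores one integer (typically several bytes) per input symbol, and that is the kind of memory cost a space-efficient construction tries to avoid.

```python
def bwt_via_suffix_array(text: str, sentinel: str = "$") -> str:
    """Naive BWT: sort all suffixes, then emit the character preceding each.

    Illustrative only (hypothetical helper, not the paper's method).
    Sorting by suffix slices is far from linear time and keeps the whole
    suffix array in memory.
    """
    if sentinel in text:
        raise ValueError("sentinel must not occur in the text")
    s = text + sentinel
    # One integer per input symbol: this array dominates the memory use.
    suffix_array = sorted(range(len(s)), key=lambda i: s[i:])
    # BWT[i] is the symbol that precedes the i-th smallest suffix;
    # index -1 wraps around to the sentinel for the suffix starting at 0.
    return "".join(s[i - 1] for i in suffix_array)


print(bwt_via_suffix_array("banana"))  # annb$aa
```

The sentinel is assumed to compare smaller than every symbol of the text, which holds for `$` against ASCII letters.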

The original version of this chapter was revised: The copyright line was incorrect. This has been corrected. The Erratum to this chapter is available at DOI: 10.1007/978-3-319-02432-5_33

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Beller, T., Zwerger, M., Gog, S., Ohlebusch, E. (2013). Space-Efficient Construction of the Burrows-Wheeler Transform. In: Kurland, O., Lewenstein, M., Porat, E. (eds) String Processing and Information Retrieval. SPIRE 2013. Lecture Notes in Computer Science, vol 8214. Springer, Cham. https://doi.org/10.1007/978-3-319-02432-5_5

  • DOI: https://doi.org/10.1007/978-3-319-02432-5_5

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-02431-8

  • Online ISBN: 978-3-319-02432-5

  • eBook Packages: Computer Science, Computer Science (R0)
