Space Efficient Linear Time Computation of the Burrows and Wheeler-Transformation

Abstract

In [4], a universal data compression algorithm (BW-algorithm, for short) is described which achieves compression rates close to the best known rates achieved in practice. Due to its simplicity, the algorithm can be implemented with relatively low complexity. Recently, [2] modified the BW-algorithm to improve the compression rate even further. For a thorough discussion of the information-theoretic background of the BW-algorithm and more references, see [1]. The most time- and space-consuming part of the BW-algorithm is the Burrows and Wheeler-Transformation (BWT, for short), which permutes the input string in such a way that characters with a similar context are grouped together. In [4], it was observed that for an input string of length n, this transformation can be computed in O(n) time and space using suffix trees. However, suffix trees have a reputation of being very greedy for space, and therefore most researchers resorted to alternative non-linear methods for computing the BWT: The Manber-Myers algorithm [9] runs in O(n log n) worst-case time and requires 8n bytes of space. The Bentley-Sedgewick algorithm [3] is based on Quicksort. It is fast on average, but its worst-case running time is O(n^2). It requires 4n bytes of space, and its running time can be improved in practice at the cost of 4n extra bytes. Recently, [11] showed how to combine the Manber-Myers algorithm with the Bentley-Sedgewick algorithm to obtain a method running in O(n log n) worst-case time and using 9n bytes.
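To make the transformation itself concrete, the following is a minimal sketch of the BWT computed by naively sorting suffixes. It is not the linear-time suffix tree method discussed in this chapter, nor any of the algorithms of [3], [9], or [11]; it only illustrates what the transformation outputs. The function name `bwt_naive` and the `sentinel` parameter are assumptions introduced for illustration, and the sketch assumes a unique end marker is appended to the input.

```python
def bwt_naive(text: str, sentinel: str = "\0") -> str:
    """Return the Burrows-Wheeler transform of text + sentinel.

    Illustrative only: sorting suffixes by direct string comparison
    can take quadratic time or worse, and slicing copies each suffix,
    so this is far from the O(n) time and space bound of the
    suffix tree approach.
    """
    if sentinel in text:
        raise ValueError("sentinel must not occur in the input")
    s = text + sentinel
    # Because the sentinel occurs exactly once, every suffix comparison
    # is decided at or before the sentinel, so sorting suffixes gives
    # the same order as sorting the cyclic rotations of s.
    order = sorted(range(len(s)), key=lambda i: s[i:])
    # The output character for each sorted suffix is the character that
    # precedes it in s (index -1 wraps around to the sentinel).
    return "".join(s[i - 1] for i in order)


print(bwt_naive("banana", sentinel="$"))  # prints "annb$aa"
```

In the output for "banana", the two occurrences of "n" that precede "a" end up adjacent, as do two of the occurrences of "a" that precede "n", which is the grouping by context that the subsequent compression stages exploit.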

References

  1. B. Balkenhol, S. Kurtz, “Universal Data Compression Based on the Burrows and Wheeler Transformation: Theory and Practice”, Technical Report, Sonderforschungsbereich: Diskrete Strukturen in der Mathematik, Universität Bielefeld, 98–069, 1998, http://www.mathematik.unibielefeld.de/sfb343/preprints/.

  2. B. Balkenhol, S. Kurtz, Y. Shtarkov, “Modification of the Burrows and Wheeler Data Compression Algorithm”, In Proceedings of the IEEE Data Compression Conference, Snowbird, Utah, IEEE Computer Society Press, 1999, 188–197.

  3. J. Bentley, R. Sedgewick, “Fast Algorithms for Sorting and Searching Strings”, In Proceedings of the ACM-SIAM Symposium on Discrete Algorithms, 1997, 360–369. http://www.cs.princeton.edu/~rs/strings/.

  4. M. Burrows, D. Wheeler, “A Block-Sorting Lossless Data Compression Algorithm”, Research Report 124, Digital Systems Research Center, 1994. http://www.gatekeeper.dec.com/pub/DEC/SRC/researchreports/abstracts/src-rr-124.html.

  5. M. Farach, “Optimal Suffix Tree Construction with Large Alphabets”, In Proceedings of the 38th Annual Symposium on the Foundations of Computer Science, FOCS 97, New York, IEEE Computer Society Press, 1997. http://www.cs.rutgers.edu/pub/farach/Suffix.ps.Z.

  6. R. Giegerich, S. Kurtz, “From Ukkonen to McCreight and Weiner: A Unifying View of Linear-Time Suffix Tree Construction”, Algorithmica, 19, 1997, 331–353.

  7. S. Kurtz, “Reducing the Space Requirement of Suffix Trees”, Report 98–03, Technische Fakultät, Universität Bielefeld, 1998. http://www.TechFak.Uni-Bielefeld.DE/techfak/~kurtz/publications.html.

  8. N. Larsson, “The Context Trees of Block Sorting Compression”, In Proceedings of the IEEE Data Compression Conference, Snowbird, Utah, March 30–April 1, IEEE Computer Society Press, 1998, 189–198.

  9. U. Manber, E. Myers, “Suffix Arrays: A New Method for On-Line String Searches”, SIAM Journal on Computing, 22 (5), 1993, 935–948.

  10. E. McCreight, “A Space-Economical Suffix Tree Construction Algorithm”, Journal of the ACM, 23 (2), 1976, 262–272.

  11. K. Sadakane, “A Fast Algorithm for Making Suffix Arrays and for Burrows-Wheeler Transformation”, In Proceedings of the IEEE Data Compression Conference, Snowbird, Utah, March 30–April 1, IEEE Computer Society Press, 1998, 129–138.

  12. E. Ukkonen, “On-line Construction of Suffix-Trees”, Algorithmica, 14 (3), 1995.

  13. P. Weiner, “Linear Pattern Matching Algorithms”, In Proceedings of the 14th IEEE Annual Symposium on Switching and Automata Theory, The University of Iowa, 1973, 1–11.

Copyright information

© 2000 Springer Science+Business Media New York

About this chapter

Cite this chapter

Kurtz, S., Balkenhol, B. (2000). Space Efficient Linear Time Computation of the Burrows and Wheeler-Transformation. In: Althöfer, I., et al. Numbers, Information and Complexity. Springer, Boston, MA. https://doi.org/10.1007/978-1-4757-6048-4_31

  • DOI: https://doi.org/10.1007/978-1-4757-6048-4_31

  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-1-4419-4967-7

  • Online ISBN: 978-1-4757-6048-4

  • eBook Packages: Springer Book Archive
