Numbers, Information and Complexity pp 375-383

# Space Efficient Linear Time Computation of the Burrows and Wheeler-Transformation

## Abstract

In [4] a universal data compression algorithm (BW-algorithm, for short) is described which achieves compression rates close to the best known rates achieved in practice. Due to its simplicity, the algorithm can be implemented with relatively low complexity. Recently, [2] modified the BW-algorithm to improve the compression rate even further. For a thorough discussion of the information theoretic background of the BW-algorithm and more references, see [1]. The most time and space consuming part of the BW-algorithm is the Burrows and Wheeler-Transformation (BWT, for short), which permutes the input string in such a way that characters with a similar context are grouped together. In [4], it was observed that for an input string of length *n*, this transformation can be computed in *O*(*n*) time and space using suffix trees. However, suffix trees have a reputation of being very greedy for space, and therefore most researchers resorted to alternative non-linear methods for computing the BWT. The algorithm of [9] runs in *O*(*n* log *n*) worst case time and requires 8*n* bytes of space. The algorithm of [3] is based on *Quicksort*. It is fast on average, but its worst case running time is *O*(*n*²). The Bentley-Sedgewick algorithm requires 4*n* bytes. Its running time can be improved in practice, at the cost of 4*n* extra bytes. Recently, [11] showed how to combine the Manber-Myers algorithm with the Bentley-Sedgewick algorithm to achieve a method running in *O*(*n* log *n*) worst case time and using 9*n* bytes.
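To make the transformation concrete, here is a minimal Python sketch of the BWT computed by naively sorting suffixes. It is not the suffix-tree method discussed above and it is not linear time; it only illustrates what the permutation produces. The function name `bwt_naive` and the unique end-of-string sentinel are assumptions introduced for this illustration.

```python
def bwt_naive(text: str, sentinel: str = "\0") -> str:
    """Return the BWT of `text` via a naive suffix sort (illustrative, not linear time).

    Assumption: `sentinel` is a single character that does not occur in `text`
    and compares smaller than every character of `text`.
    """
    if len(sentinel) != 1 or sentinel in text:
        raise ValueError("sentinel must be one character that is absent from the input")
    s = text + sentinel
    # Sort suffix start positions by the suffix beginning there.
    # The unique sentinel guarantees that no suffix is a prefix of another.
    order = sorted(range(len(s)), key=lambda i: s[i:])
    # For each suffix in sorted order, emit the character that precedes it
    # (cyclically, so the suffix at position 0 is preceded by the sentinel).
    return "".join(s[i - 1] for i in order)


print(bwt_naive("banana", "$"))  # annb$aa
```

In the output `annb$aa`, the two `n` characters are adjacent because both are followed by the context `a...`, which is the grouping effect the abstract describes. Each suffix comparison here can cost up to *O*(*n*) time, which is why practical implementations rely on suffix arrays or suffix trees instead.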

## Keywords

Head Position, Large Node, Implementation Technique, Suffix Tree, Input String

## References

- [1] B. Balkenhol, S. Kurtz, “Universal Data Compression Based on the Burrows and Wheeler Transformation: Theory and Practice”, *Technical Report, Sonderforschungsbereich: Diskrete Strukturen in der Mathematik, Universität Bielefeld, 98–069*, 1998, http://www.mathematik.unibielefeld.de/sfb343/preprints/.
- [2] B. Balkenhol, S. Kurtz and Y. Shtarkov, “Modification of the Burrows and Wheeler Data Compression Algorithm”, In *Proceedings of the IEEE Data Compression Conference, Snowbird, Utah*, IEEE Computer Society Press, 1999, 188–197.
- [3] J. Bentley, R. Sedgewick, “Fast Algorithms for Sorting and Searching Strings”, In *Proceedings of the ACM-SIAM Symposium on Discrete Algorithms*, 1997, 360–369. http://www.cs.princeton.edu/~rs/strings/.
- [4] M. Burrows, D. Wheeler, “A Block-Sorting Lossless Data Compression Algorithm”, *Research Report 124*, Digital Systems Research Center, 1994. http://www.gatekeeper.dec.com/pub/DEC/SRC/researchreports/abstracts/src-rr-124.html.
- [5] M. Farach, “Optimal Suffix Tree Construction with Large Alphabets”, In *Proceedings of the 38th Annual Symposium on the Foundations of Computer Science, FOCS 97*, New York, IEEE Comput. Soc. Press, 1997. http://www.cs.rutgers.edu/pub/farach/Suffix.ps.Z.
- [6] R. Giegerich, S. Kurtz, “From Ukkonen to McCreight and Weiner: A Unifying View of Linear-Time Suffix Tree Construction”, *Algorithmica*, **19**, 1997, 331–353.
- [7] S. Kurtz, “Reducing the Space Requirement of Suffix Trees”, *Report 98–03*, Technische Fakultät, Universität Bielefeld, 1998. http://www.TechFak.Uni-Bielefeld.DE/techfak/~kurtz/publications.html.
- [8] N. Larsson, “The Context Trees of Block Sorting Compression”, In *Proceedings of the IEEE Data Compression Conference, Snowbird, Utah, March 30–April 1*, IEEE Computer Society Press, 1998, 189–198.
- [9] U. Manber, E. Myers, “Suffix Arrays: A New Method for On-Line String Searches”, *SIAM Journal on Computing*, **22**(5), 1993, 935–948.
- [10] E. McCreight, “A Space-Economical Suffix Tree Construction Algorithm”, *Journal of the ACM*, **23**(2), 1976, 262–272.
- [11] K. Sadakane, “A Fast Algorithm for Making Suffix Arrays and for Burrows-Wheeler Transformation”, In *Proceedings of the IEEE Data Compression Conference, Snowbird, Utah, March 30–April 1*, IEEE Computer Society Press, 1998, 129–138.
- [12]
- [13] P. Weiner, “Linear Pattern Matching Algorithms”, In *Proceedings of the 14th IEEE Annual Symposium on Switching and Automata Theory*, The University of Iowa, 1973, 1–11.