Coders for the Burrows-Wheeler Transform

doi:10.1007/978-0-387-78909-5_3

Like most transforms, the Burrows-Wheeler Transform (BWT) does not change the size of the file being transformed; it merely rearranges the file so that it is easier to represent compactly. The output then needs to be coded in a second phase, which we will refer to as the “Local to Global Transform” (LGT). Figure 3.1 shows a section of the transformed text for Shakespeare's “Hamlet”, which reveals the kind of regularities that the BWT exposes. These characters are the ones that appear before the context “nd”: initially the “nd” is followed by a space, and hence “a” is very common (as in “and”); then the context becomes “ndeed”, where “i” becomes common (as in “indeed”); and the last few characters precede “nder”.
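To make the grouping by context concrete, the following is a minimal sketch of a naive forward BWT that sorts all rotations of the input and keeps the last column. The function name `bwt` and the end-of-string sentinel `eos` are assumptions made for illustration; they are not taken from the chapter, and practical implementations normally use suffix arrays rather than sorting whole rotations.

```python
def bwt(text, eos="\0"):
    """Naive forward BWT (illustrative sketch only).

    Assumption: `eos` is a sentinel character that does not occur in `text`
    and sorts before every other character in it.
    """
    s = text + eos
    # Build every rotation of s and sort them lexicographically.
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    # The transform is the last character of each sorted rotation, i.e. the
    # character that precedes each sorted context.
    return "".join(rot[-1] for rot in rotations)


if __name__ == "__main__":
    # The output has the same length as the input plus the sentinel:
    # the transform rearranges characters, it does not shrink the data.
    print(bwt("banana", eos="$"))
```

Because rotations that begin with the same context sort next to each other, the characters that precede that context end up adjacent in the output, which is the effect described for the “nd” context above.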

Clearly the text in Figure 3.1 contains a lot of patterns, and will therefore be easy to compress. Many sophisticated techniques have been proposed to exploit the regularities of the BWT-transformed text, and yet it has emerged that one of the simplest approaches (RleAc, based on run-length encoding and an order-zero arithmetic coder) gives the best compression and is also very fast compared with more complicated methods. We will begin this section by looking at this simple coder, and later review various other approaches that have been proposed, including Burrows and Wheeler's original “Move to Front” (MTF) list, inversion frequencies, distance coding, frequency counting methods, wavelet trees, and alternative permutations. We will also consider the effect of the block size on compression performance.
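As a rough illustration of two of the building blocks named above, the sketch below shows a textbook Move to Front encoder and a simple run-length pass. The function names `mtf_encode` and `run_lengths` are hypothetical, and this is not the RleAc coder itself: the arithmetic-coding stage is omitted, and the exact run-length scheme used by RleAc is not given in this text.

```python
from itertools import groupby


def mtf_encode(data, alphabet=None):
    """Move to Front: replace each symbol by its current position in a list,
    then move that symbol to the front of the list.

    Assumption: if no alphabet is given, the sorted set of symbols in `data`
    is used as the initial list.
    """
    table = sorted(set(data)) if alphabet is None else list(alphabet)
    out = []
    for ch in data:
        idx = table.index(ch)
        out.append(idx)
        # Recently seen symbols move to the front, so runs become zeros.
        table.insert(0, table.pop(idx))
    return out


def run_lengths(data):
    """Collapse each run of equal symbols into a (symbol, length) pair."""
    return [(sym, len(list(group))) for sym, group in groupby(data)]


if __name__ == "__main__":
    sample = "aaabbb"
    print(mtf_encode(sample))   # small integers, dominated by zeros
    print(run_lengths(sample))  # one pair per run
```

Both passes turn local repetition in the BWT output into a skewed distribution of small values, which is the kind of input an order-zero entropy coder handles well.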

(2008). Coders for the Burrows-Wheeler Transform. In: The Burrows-Wheeler Transform: Data Compression, Suffix Arrays, and Pattern Matching. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-78909-5_3

DOI: https://doi.org/10.1007/978-0-387-78909-5_3
Publisher Name: Springer, Boston, MA
Print ISBN: 978-0-387-78908-8
Online ISBN: 978-0-387-78909-5
eBook Packages: Computer Science (R0)