Damaged BZip Files Are Difficult to Repair
bzip is a program written by Julian Seward that is often used under Unix to compress single files. It splits the file into blocks, which are compressed individually using a combination of the Burrows-Wheeler-Transformation, the Move-To-Front algorithm, Huffman coding, and run-length encoding. The author himself stated that damaged compressed blocks, i.e., blocks of which parts are lost, are essentially non-recoverable. This paper gives a formal proof that this is indeed true: focusing on the Burrows-Wheeler-Transformation, the problem of completing a transformed string such that the decoded string obeys certain file format restrictions is NP-hard.
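To make the central object concrete, the following is a minimal sketch of the Burrows-Wheeler-Transformation and its inverse on a short string. It is a textbook formulation based on sorting all rotations, not the code used by bzip itself; the function names `bwt` and `inverse_bwt` are chosen here for illustration only.

```python
from typing import Tuple


def bwt(s: str) -> Tuple[str, int]:
    """Naive Burrows-Wheeler transform (illustrative, quadratic space).

    Returns the last column of the sorted rotation matrix together with
    the row index at which the original string appears.
    """
    n = len(s)
    # Sort rotation start positions by the rotation they denote.
    rotations = sorted(range(n), key=lambda i: s[i:] + s[:i])
    # The last character of the rotation starting at i is s[i - 1].
    last = "".join(s[(i - 1) % n] for i in rotations)
    return last, rotations.index(0)


def inverse_bwt(last: str, index: int) -> str:
    """Invert the transform from the last column and the original row index."""
    n = len(last)
    # A stable sort of positions by character yields the first column and,
    # at the same time, the link from each row to the row holding the
    # rotation shifted left by one character.
    order = sorted(range(n), key=lambda i: last[i])
    out = []
    row = index
    for _ in range(n):
        row = order[row]
        out.append(last[row])
    return "".join(out)


if __name__ == "__main__":
    transformed, idx = bwt("banana")
    print(transformed, idx)                  # nnbaaa 3
    print(inverse_bwt(transformed, idx))     # banana
```

Decoding follows a chain of links through the entire transformed string, so every output character depends on the sorted order of all characters in the block. This gives some intuition for why filling in lost positions of a transformed string is not a local repair, although the sketch itself proves nothing about hardness.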
Keywords: Bipartite Graph, Hamiltonian Cycle, Original Graph, Outgoing Edge, Input String
- 1. Brandstädt, A., Le, V.B., Spinrad, J.P.: Graph Classes: A Survey. SIAM Monographs on Discrete Mathematics and Applications (1999)
- 2. Burrows, M., Wheeler, D.J.: A Block-sorting Lossless Data Compression Algorithm. SRC Research Report (1994)
- 4. Hopcroft, J.E., Motwani, R., Ullman, J.D.: Introduction to Automata Theory, Languages, and Computation. Addison-Wesley, Reading (2000)
- 5. Seward, J.: The official BZip Homepage, http://www.bzip.org