An Improved Algorithm for Sequence Comparison with Block Reversals
Given two sequences X and Y that are strings over some alphabet, we consider the distance $d(X, Y)$ between them, defined to be the minimum number of character replacements and block (substring) reversals needed to transform X into Y (or vice versa). This is the “simplest” sequence comparison problem we know of that allows natural block edit operations. Block reversals arise naturally in genomic sequence comparison; they are also of interest in matching music data. We present an improved algorithm for exactly computing the distance $d(X, Y)$; it takes time $O(|X| \log^2 |X|)$ and hence is near-linear. The trivial approach takes quadratic time, and the best previously known algorithm for this problem takes time $\Omega(|X| \log^3 |X|)$.
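To make the definition concrete, the following is a minimal sketch of a quadratic-time baseline, not the near-linear algorithm the abstract announces. It rests on two assumptions that the abstract does not state: X and Y have equal length, and every position is touched by at most one operation (so reversed blocks do not overlap and are not further edited). The function name `block_reversal_distance` and the table names `rev` and `dist` are hypothetical, chosen only for illustration.

```python
def block_reversal_distance(x: str, y: str) -> int:
    """Quadratic baseline for d(X, Y), assuming disjoint operations."""
    if len(x) != len(y):
        raise ValueError("replacements and reversals preserve length")
    n = len(x)

    # rev[j][i] is True iff reversing the block x[j..i] yields y[j..i].
    # Filled by increasing block length, peeling one character off each end.
    rev = [[False] * n for _ in range(n)]
    for length in range(1, n + 1):
        for j in range(0, n - length + 1):
            i = j + length - 1
            if x[j] == y[i] and x[i] == y[j]:
                rev[j][i] = length <= 2 or rev[j + 1][i - 1]

    # dist[i] is the cost of turning the prefix x[:i] into y[:i].
    dist = [0] * (n + 1)
    for i in range(1, n + 1):
        # Option 1: handle the last character alone (free if it matches).
        best = dist[i - 1] + (x[i - 1] != y[i - 1])
        # Option 2: end a reversed block of length >= 2 at position i - 1.
        for j in range(0, i - 1):
            if rev[j][i - 1]:
                best = min(best, dist[j] + 1)
        dist[i] = best
    return dist[n]


print(block_reversal_distance("abc", "cba"))  # 1: one reversal of the whole string
```

Both tables are filled in time proportional to the square of the sequence length, which matches the quadratic cost the abstract attributes to the trivial approach under the assumptions above.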