Abstract
Sparse Dynamic Programming has emerged as an essential tool for the design of efficient algorithms for optimization problems coming from such diverse areas as Computer Science, Computational Biology and Speech Recognition [7], [11], [15]. We provide a new Sparse Dynamic Programming technique that extends the Hunt-Szymanski [2], [8], [9] paradigm for the computation of the Longest Common Subsequence (LCS) and apply it to solve the LCS from Fragments problem: given a pair of strings X and Y (of lengths n and m, respectively) and a set M of matching substrings of X and Y, find the longest common subsequence based only on the symbol correspondences induced by the substrings. This problem arises in an application to the analysis of software systems. Our algorithm solves the problem in O(|M| log |M|) time using balanced trees, or O(|M| log log min(|M|, nm/|M|)) time using Johnson’s version of Flat Trees [10]. These bounds apply for two cost measures. The algorithm can also be adapted to find the usual LCS in O((m + n) log |Σ| + |M| log |M|) time using balanced trees, or O((m + n) log |Σ| + |M| log log min(|M|, nm/|M|)) time using Johnson’s Flat Trees, where M is the set of maximal matches between substrings of X and Y and Σ is the alphabet.
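To make the problem statement concrete, the following is a minimal sketch of a naive baseline, not the algorithm of the paper. It assumes each fragment is given as a hypothetical triple (i, j, length) meaning X[i:i+length] equals Y[j:j+length] with 0-based positions, expands every fragment into the individual symbol matches it induces, and then applies a Hunt-Szymanski style longest-increasing-subsequence pass over those matches. Its running time depends on the total length of the fragments rather than on |M|, which is exactly the cost the sparse technique in the abstract is designed to avoid. The function names are illustrative only.

```python
from bisect import bisect_left


def lcs_from_fragments(fragments):
    """Length of the longest chain of matches induced by the fragments.

    fragments: iterable of (i, j, length) triples (an assumed encoding),
    each meaning X[i:i+length] == Y[j:j+length], 0-based.
    """
    # Expand each fragment into the symbol correspondences it induces.
    matches = set()
    for i, j, length in fragments:
        for k in range(length):
            matches.add((i + k, j + k))

    # Sort by X position ascending and by Y position descending within the
    # same X position, so that a strictly increasing run of Y positions can
    # never use two matches that share an X position.
    ordered = sorted(matches, key=lambda p: (p[0], -p[1]))

    # tails[k] is the smallest Y position that ends a chain of k + 1 matches.
    tails = []
    for _, j in ordered:
        pos = bisect_left(tails, j)
        if pos == len(tails):
            tails.append(j)
        else:
            tails[pos] = j
    return len(tails)


def single_symbol_fragments(x, y):
    """All length-1 fragments of x and y; with these, the result is the usual LCS length."""
    return [(i, j, 1) for i, a in enumerate(x) for j, b in enumerate(y) if a == b]


# X = "abcde", Y = "cabde": "ab" matches at (0, 1), "c" at (2, 0), "de" at (3, 3).
example = [(0, 1, 2), (2, 0, 1), (3, 3, 2)]
print(lcs_from_fragments(example))  # 4, corresponding to "abde"
```

Passing the output of `single_symbol_fragments(x, y)` to `lcs_from_fragments` recovers the ordinary LCS length, since every pair of equal symbols is then an admissible match. Restricting the fragment set can only shorten the result.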
Work supported in part by grants from the Italian Ministry of Scientific Research and by the Italian National Research Council. Part of this work was done while the author was visiting Bell Laboratories of Lucent Technologies.
References
1. A.V. Aho, J.E. Hopcroft, and J.D. Ullman. Data Structures and Algorithms. Addison-Wesley, Reading, MA, 1983.
2. A. Apostolico. String editing and longest common subsequence. In G. Rozenberg and A. Salomaa, editors, Handbook of Formal Languages, Vol. 2, pages 361–398, Berlin, 1997. Springer Verlag.
3. B.S. Baker. A theory of parameterized pattern matching: Algorithms and applications. In Proc. 25th Symposium on Theory of Computing, pages 71–80. ACM, 1993.
4. D. Eppstein, Z. Galil, R. Giancarlo, and G. Italiano. Sparse dynamic programming I: Linear cost functions. J. of ACM, 39:519–545, 1992.
5. D. Eppstein, Z. Galil, R. Giancarlo, and G. Italiano. Sparse dynamic programming II: Convex and concave cost functions. J. of ACM, 39:546–567, 1992.
6. M. Farach and M. Thorup. Optimal evolutionary tree comparison by sparse dynamic programming. In Proc. 35th Symposium on Foundations of Computer Science, pages 770–779. IEEE, 1994.
7. D. Gusfield. Algorithms on Strings, Trees and Sequences: Computer Science and Computational Biology. Cambridge University Press, Cambridge, 1997.
8. D.S. Hirschberg. Serial computations of Levenshtein distances. In A. Apostolico and Z. Galil, editors, Pattern Matching Algorithms, pages 123–142, Oxford, 1997. Oxford University Press.
9. J.W. Hunt and T.G. Szymanski. A fast algorithm for computing longest common subsequences. Comm. of the ACM, 20:350–353, 1977.
10. D.B. Johnson. A priority queue in which initialization and queue operations take O(log log D) time. Math. Sys. Th., 15:295–309, 1982.
11. J.B. Kruskal and D. Sankoff, editors. Time Warps, String Edits, and Macromolecules: The Theory and Practice of Sequence Comparison. Addison-Wesley, 1983.
12. W. Miller and E. Myers. Chaining multiple alignment fragments in sub-quadratic time. In Proc. of 6th ACM-SIAM SODA, pages 48–57, 1995.
13. E.W. Myers. An O(ND) difference algorithm and its variations. Algorithmica, 1:251–266, 1986.
14. P. van Emde Boas. Preserving order in a forest in less than logarithmic time. Info. Proc. Lett., 6:80–82, 1977.
15. M.S. Waterman. Introduction to Computational Biology. Maps, Sequences and Genomes. Chapman Hall, Los Angeles, 1995.
Copyright information
© 1998 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Baker, B.S., Giancarlo, R. (1998). Longest Common Subsequence from Fragments via Sparse Dynamic Programming. In: Bilardi, G., Italiano, G.F., Pietracaprina, A., Pucci, G. (eds) Algorithms — ESA’ 98. ESA 1998. Lecture Notes in Computer Science, vol 1461. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-68530-8_7
DOI: https://doi.org/10.1007/3-540-68530-8_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-64848-2
Online ISBN: 978-3-540-68530-2
eBook Packages: Springer Book Archive