Two algorithms for the longest common subsequence of three (or more) strings

Irving, Robert W.; Fraser, Campbell B.

doi:10.1007/3-540-56024-6_18


Part of the book series: Lecture Notes in Computer Science (LNCS, volume 644)

Included in the following conference series:

Annual Symposium on Combinatorial Pattern Matching


Abstract

Various algorithms have been proposed, over the years, for the longest common subsequence problem on 2 strings (2-LCS), many of these improving, at least for some cases, on the classical dynamic programming approach. However, relatively little attention has been paid in the literature to the k-LCS problem for k > 2, a problem that has interesting applications in areas such as the multiple alignment of sequences in molecular biology.

In this paper, we describe and analyse two algorithms with particular reference to the 3-LCS problem, though each algorithm can be extended to solve the k-LCS problem for general k. The first algorithm, which can be viewed as a “lazy” version of dynamic programming, has time and space complexity that is O(n(n−l)²) for 3 strings, and O(kn(n−l)^(k−1)) for k strings, where n is the common length of the strings and l is the length of an LCS. The second algorithm, which involves evaluating entries in a “threshold” table in diagonal order, has time and space complexity that is O(l(n−l)² + sn) for 3 strings, and O(kl(n−l)^(k−1) + ksn) for k strings, where s is the alphabet size. For simplicity, the algorithms are presented for equal-length strings, though extension to unequal-length strings is straightforward.
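For orientation, the basic dynamic programming approach that both algorithms are measured against can be sketched as follows for three strings. This is a minimal illustration of the classical cubic-time table, not either of the two algorithms described in the paper; the function name lcs3_length is chosen here for illustration only.

```python
def lcs3_length(a: str, b: str, c: str) -> int:
    """Length of a longest common subsequence of three strings.

    Classical dynamic programming baseline: fills every cell of a
    (len(a)+1) x (len(b)+1) x (len(c)+1) table, so time and space are
    cubic for equal-length strings.
    """
    # table[i][j][k] holds the LCS length of the prefixes a[:i], b[:j], c[:k].
    table = [[[0] * (len(c) + 1) for _ in range(len(b) + 1)]
             for _ in range(len(a) + 1)]
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            for k in range(1, len(c) + 1):
                if a[i - 1] == b[j - 1] == c[k - 1]:
                    # All three prefixes end in the same symbol: extend the LCS.
                    table[i][j][k] = table[i - 1][j - 1][k - 1] + 1
                else:
                    # Otherwise drop the last symbol of one of the prefixes.
                    table[i][j][k] = max(table[i - 1][j][k],
                                         table[i][j - 1][k],
                                         table[i][j][k - 1])
    return table[len(a)][len(b)][len(c)]
```

The sketch evaluates every table entry regardless of how similar the strings are, which is the cost that a lazy or threshold-based evaluation aims to avoid.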

Empirical evidence is presented to show that both algorithms offer significant improvement on the basic dynamic programming approach, and on an earlier algorithm proposed by Hsu and Du, particularly, as would be expected, when l is relatively large, with the balance of evidence heavily in favour of the threshold approach.
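A rough worked example may help show why a large l favours both algorithms. The values n = 1000, l = 900 and s = 4 are assumptions chosen purely for illustration, and constant factors hidden by the O-notation are ignored, so the figures indicate orders of magnitude only. The n³ term is the table size of basic dynamic programming on three strings of length n.

```latex
\[
\begin{aligned}
n^{3} &= 1000^{3} = 10^{9} && \text{(basic dynamic programming)}\\
n(n-l)^{2} &= 1000 \cdot 100^{2} = 10^{7} && \text{(lazy algorithm bound)}\\
l(n-l)^{2} + sn &= 900 \cdot 100^{2} + 4 \cdot 1000 = 9\,004\,000 && \text{(threshold algorithm bound)}
\end{aligned}
\]
```

With a short LCS the advantage largely disappears: for l = 100 and the same n, the term n(n−l)² is 1000 · 900² = 8.1 × 10⁸, which is close to n³.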

Supported by a postgraduate research studentship from the Science and Engineering Research Council


References

A. Apostolico. Improving the worst-case performance of the Hunt-Szymanski strategy for the longest common subsequence of two strings. Information Processing Letters, 23:63–69, 1986.
A. Apostolico, S. Browne, and C. Guerra. Fast linear-space computations of longest common subsequences. Theoretical Computer Science, 92:3–17, 1992.
A. Apostolico and C. Guerra. The longest common subsequence problem revisited. Algorithmica, 2:315–336, 1987.
D.S. Hirschberg. A linear space algorithm for computing maximal common subsequences. Communications of the A.C.M., 18:341–343, 1975.
D.S. Hirschberg. Algorithms for the longest common subsequence problem. Journal of the A.C.M., 24:664–675, 1977.
W.J. Hsu and M.W. Du. Computing a longest common subsequence for a set of strings. BIT, 24:45–59, 1984.
J.W. Hunt and T.G. Szymanski. A fast algorithm for computing longest common subsequences. Communications of the A.C.M., 20:350–353, 1977.
S.Y. Itoga. The string merging problem. BIT, 21:20–30, 1981.
W.J. Masek and M.S. Paterson. A faster algorithm for computing string editing distances. J. Comput. System Sci., 20:18–31, 1980.
E.W. Myers. An O(ND) difference algorithm and its variations. Algorithmica, 1:251–266, 1986.
N. Nakatsu, Y. Kambayashi, and S. Yajima. A longest common subsequence algorithm suitable for similar text strings. Acta Informatica, 18:171–179, 1982.
D. Sankoff. Matching sequences under deletion insertion constraints. Proc. Nat. Acad. Sci. U.S.A., 69:4–6, 1972.
E. Ukkonen. Algorithms for approximate string matching. Information and Control, 64:100–118, 1985.
R.A. Wagner and M.J. Fischer. The string-to-string correction problem. Journal of the A.C.M., 21:168–173, 1974.
S. Wu, U. Manber, G. Myers, and W. Miller. An O(NP) sequence comparison algorithm. Information Processing Letters, 35:317–323, 1990.


Author information

Authors and Affiliations

Computing Science Department, University of Glasgow, Glasgow, Scotland
Robert W. Irving & Campbell B. Fraser


Editor information

Alberto Apostolico, Maxime Crochemore, Zvi Galil, Udi Manber


About this paper

Cite this paper

Irving, R.W., Fraser, C.B. (1992). Two algorithms for the longest common subsequence of three (or more) strings. In: Apostolico, A., Crochemore, M., Galil, Z., Manber, U. (eds) Combinatorial Pattern Matching. CPM 1992. Lecture Notes in Computer Science, vol 644. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-56024-6_18


DOI: https://doi.org/10.1007/3-540-56024-6_18
Published: 04 June 2005
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-56024-1
Online ISBN: 978-3-540-47357-2
eBook Packages: Springer Book Archive
