Two algorithms for the longest common subsequence of three (or more) strings

  • Conference paper
Combinatorial Pattern Matching (CPM 1992)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 644)

Abstract

Various algorithms have been proposed, over the years, for the longest common subsequence problem on 2 strings (2-LCS), many of these improving, at least for some cases, on the classical dynamic programming approach. However, relatively little attention has been paid in the literature to the k-LCS problem for k > 2, a problem that has interesting applications in areas such as the multiple alignment of sequences in molecular biology.

In this paper, we describe and analyse two algorithms with particular reference to the 3-LCS problem, though each algorithm can be extended to solve the k-LCS problem for general k. The first algorithm, which can be viewed as a “lazy” version of dynamic programming, has time and space complexity that is $O(n(n-l)^2)$ for 3 strings, and $O(kn(n-l)^{k-1})$ for k strings, where n is the common length of the strings and l is the length of an LCS. The second algorithm, which involves evaluating entries in a “threshold” table in diagonal order, has time and space complexity that is $O(l(n-l)^2 + sn)$ for 3 strings, and $O(kl(n-l)^{k-1} + ksn)$ for k strings, where s is the alphabet size. For simplicity, the algorithms are presented for equal-length strings, though extension to unequal-length strings is straightforward.
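To give a rough sense of scale for the 3-string bounds, the arithmetic below plugs in illustrative values: n = 1000, l = 900 and s = 4. These values are assumptions chosen here, not figures from the paper, and constant factors are ignored. The n^3 term is the usual table size of basic dynamic programming on three strings of length n.

```latex
\begin{align*}
\text{basic dynamic programming:}\quad & n^3 = 1000^3 = 10^9 \\
\text{lazy algorithm:}\quad & n(n-l)^2 = 1000 \cdot 100^2 = 10^7 \\
\text{threshold algorithm:}\quad & l(n-l)^2 + sn = 900 \cdot 100^2 + 4 \cdot 1000 = 9\,004\,000
\end{align*}
```

Under these assumed values both bounds are about two orders of magnitude below the cubic table, which matches the claim that the gain is largest when l is close to n.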

Empirical evidence is presented showing that both algorithms improve significantly on the basic dynamic programming approach, and on an earlier algorithm proposed by Hsu and Du, particularly, as would be expected, when l is relatively large. The balance of evidence is heavily in favour of the threshold approach.
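For orientation, the basic dynamic programming approach used as the baseline can be sketched for three strings as follows. This is a minimal sketch of the textbook recurrence only; it is not the lazy or threshold algorithm of the paper, and the function name `lcs3_length` is a hypothetical name introduced for illustration.

```python
def lcs3_length(a: str, b: str, c: str) -> int:
    """Length of a longest common subsequence of three strings.

    Baseline cubic-time, cubic-space dynamic programming (illustrative only;
    not the lazy or threshold algorithm described in the abstract).
    """
    la, lb, lc = len(a), len(b), len(c)
    # table[i][j][k] holds the LCS length of the prefixes a[:i], b[:j], c[:k].
    table = [[[0] * (lc + 1) for _ in range(lb + 1)] for _ in range(la + 1)]
    for i in range(1, la + 1):
        for j in range(1, lb + 1):
            for k in range(1, lc + 1):
                if a[i - 1] == b[j - 1] == c[k - 1]:
                    # All three prefixes end in the same symbol: extend the LCS.
                    table[i][j][k] = table[i - 1][j - 1][k - 1] + 1
                else:
                    # Otherwise drop the last symbol of one of the prefixes.
                    table[i][j][k] = max(
                        table[i - 1][j][k],
                        table[i][j - 1][k],
                        table[i][j][k - 1],
                    )
    return table[la][lb][lc]
```

Every one of the (n+1)^3 table entries is filled regardless of how similar the strings are, which is the cost the two algorithms in the paper aim to avoid when l is large.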

Supported by a postgraduate research studentship from the Science and Engineering Research Council

References

  1. A. Apostolico. Improving the worst-case performance of the Hunt-Szymanski strategy for the longest common subsequence of two strings. Information Processing Letters, 23:63–69, 1986.
  2. A. Apostolico, S. Browne, and C. Guerra. Fast linear-space computations of longest common subsequences. Theoretical Computer Science, 92:3–17, 1992.
  3. A. Apostolico and C. Guerra. The longest common subsequence problem revisited. Algorithmica, 2:315–336, 1987.
  4. D.S. Hirschberg. A linear space algorithm for computing maximal common subsequences. Communications of the A.C.M., 18:341–343, 1975.
  5. D.S. Hirschberg. Algorithms for the longest common subsequence problem. Journal of the A.C.M., 24:664–675, 1977.
  6. W.J. Hsu and M.W. Du. Computing a longest common subsequence for a set of strings. BIT, 24:45–59, 1984.
  7. J.W. Hunt and T.G. Szymanski. A fast algorithm for computing longest common subsequences. Communications of the A.C.M., 20:350–353, 1977.
  8. S.Y. Itoga. The string merging problem. BIT, 21:20–30, 1981.
  9. W.J. Masek and M.S. Paterson. A faster algorithm for computing string editing distances. J. Comput. System Sci., 20:18–31, 1980.
  10. E.W. Myers. An O(ND) difference algorithm and its variations. Algorithmica, 1:251–266, 1986.
  11. N. Nakatsu, Y. Kambayashi, and S. Yajima. A longest common subsequence algorithm suitable for similar text strings. Acta Informatica, 18:171–179, 1982.
  12. D. Sankoff. Matching sequences under deletion insertion constraints. Proc. Nat. Acad. Sci. U.S.A., 69:4–6, 1972.
  13. E. Ukkonen. Algorithms for approximate string matching. Information and Control, 64:100–118, 1985.
  14. R.A. Wagner and M.J. Fischer. The string-to-string correction problem. Journal of the A.C.M., 21:168–173, 1974.
  15. S. Wu, U. Manber, G. Myers, and W. Miller. An O(NP) sequence comparison algorithm. Information Processing Letters, 35:317–323, 1990.

Editor information

Alberto Apostolico, Maxime Crochemore, Zvi Galil, Udi Manber

Copyright information

© 1992 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Irving, R.W., Fraser, C.B. (1992). Two algorithms for the longest common subsequence of three (or more) strings. In: Apostolico, A., Crochemore, M., Galil, Z., Manber, U. (eds) Combinatorial Pattern Matching. CPM 1992. Lecture Notes in Computer Science, vol 644. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-56024-6_18

  • DOI: https://doi.org/10.1007/3-540-56024-6_18

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-56024-1

  • Online ISBN: 978-3-540-47357-2

  • eBook Packages: Springer Book Archive
