New Refinement Techniques for Longest Common Subsequence Algorithms

  • Lasse Bergroth
  • Harri Hakonen
  • Juri Väisänen
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2857)

Abstract

Certain properties of the input strings have a dominating influence on the running time of an algorithm selected to solve the longest common subsequence (lcs) problem of two input strings. It has turned out to be difficult, both theoretically and practically, to develop an lcs algorithm that is superior for all problem instances. Furthermore, implementing the most advanced lcs algorithms presented recently is laborious.

This paper shows that it is still beneficial to refine the traditional lcs algorithms to obtain new variants that are, in practice, competitive with the modern lcs methods on certain problem instances. We present and analyse a general-purpose algorithm, NKY-MODIF, which has moderate time and space efficiency and is easy to implement correctly. The algorithm is based on the so-called diagonal-wise method of Nakatsu, Kambayashi and Yajima (NKY). The NKY algorithm was selected for further consideration because it is algorithmically independent of the size of the input alphabet and has a light pre-processing phase.

The NKY-MODIF algorithm refines the NKY method in essentially three ways: by reducing unnecessary scanning over the input sequences, by storing the intermediate results more locally, and by utilizing lower and upper bound knowledge about the lcs. To demonstrate that some of the presented ideas are not specific to NKY, we apply lower bound information to two lcs algorithms whose processing approach differs from that of NKY. This introduces a new way to solve the lcs problem.
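To make the notion of lower and upper bound knowledge concrete, the following is a minimal sketch of two simple bounds on the lcs length. It is an illustration only and is not claimed to be the heuristics used in NKY-MODIF: the function names `lcs_upper_bound` and `lcs_lower_bound` are hypothetical, the upper bound is a symbol-count argument, and the lower bound is the length of a greedily built common subsequence.

```python
from collections import Counter


def lcs_upper_bound(a: str, b: str) -> int:
    """Upper bound (illustrative assumption, not the paper's method).

    A common subsequence cannot use a symbol more often than it
    occurs in either input string.
    """
    count_a, count_b = Counter(a), Counter(b)
    return sum(min(count_a[ch], count_b[ch]) for ch in count_a)


def lcs_lower_bound(a: str, b: str) -> int:
    """Lower bound (illustrative assumption, not the paper's method).

    Scan a from left to right and match each symbol to its next
    unused occurrence in b. The matched symbols form a common
    subsequence, so their count never exceeds the lcs length.
    """
    pos = 0
    length = 0
    for ch in a:
        nxt = b.find(ch, pos)
        if nxt != -1:
            length += 1
            pos = nxt + 1
    return length
```

For any pair of strings the true lcs length lies between the two returned values, so an exact algorithm could in principle restrict its search to that interval.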

The lcs problem has two variants: calculating only the length of the lcs, and also determining the symbols belonging to one instance of the lcs. We verify the presented ideas for both of these problem types by extensive test runs.
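The two problem variants can be illustrated with the textbook dynamic-programming formulation. This is a baseline sketch only, not the NKY or NKY-MODIF algorithm; the function names are hypothetical. The length-only variant needs just two rows of the table, whereas recovering one lcs instance in this simple form keeps the whole table for backtracking.

```python
def lcs_length(a: str, b: str) -> int:
    """Length-only variant: keeps a single previous row of the DP table."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, 1):
            if x == y:
                cur.append(prev[j - 1] + 1)
            else:
                cur.append(max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def lcs_string(a: str, b: str) -> str:
    """Symbol-recovering variant: fills the full table, then backtracks."""
    m, n = len(a), len(b)
    table = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if a[i - 1] == b[j - 1]:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    out = []
    i, j = m, n
    while i > 0 and j > 0:
        if a[i - 1] == b[j - 1]:
            out.append(a[i - 1])
            i -= 1
            j -= 1
        elif table[i - 1][j] >= table[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return "".join(reversed(out))


def is_subsequence(s: str, t: str) -> bool:
    """Check that s can be obtained from t by deleting symbols."""
    it = iter(t)
    return all(ch in it for ch in s)
```

Several different strings may be valid lcs instances for the same input, so a caller should rely on the returned string being a common subsequence of maximal length and not on which instance is returned.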

Keywords

longest common subsequence; string algorithms; heuristic algorithms

References

  1. Wagner, R.A., Fischer, M.J.: The string to string correction problem. Journal of the Association for Computing Machinery 21(1), 168–173 (1974)
  2. Hirschberg, D.S.: Algorithms for the Longest Common Subsequence problem. Journal of the Association for Computing Machinery 24(4), 664–675 (1977)
  3. Hunt, J.W., Szymanski, T.G.: A Fast Algorithm for Computing Longest Common Subsequences. Communications of the ACM 20(5), 350–353 (1977)
  4. Mukhopadhyay, A.: A Fast Algorithm for the Longest-Common-Subsequence Problem. Information Sciences 20, 69–82 (1980)
  5. Bergroth, L., Hakonen, H., Raita, T.: A Survey of Longest Common Subsequence Algorithms. In: Proceedings of SPIRE 2000, A Coruña, Spain, pp. 39–47 (2000)
  6. Chin, F.Y.L., Poon, C.K.: A Fast Algorithm for Computing Longest Common Subsequences of Small Alphabet Size. Journal of Information Processing 13(4), 463–469 (1990)
  7. Hsu, W.J., Du, M.W.: New Algorithms for the LCS Problem. Journal of Computer and System Sciences 29, 133–152 (1984)
  8. Apostolico, A., Guerra, C.: The Longest Common Subsequence Problem Revisited. Algorithmica 2, 315–336 (1987)
  9. Rick, C.: New Algorithms for the Longest Common Subsequence Problem, Institut für Informatik der Universität Bonn, Research Report No. 85123-Cs (October 1994)
  10. Miller, W., Myers, E.W.: A File Comparison Program. Software – Practice and Experience 15(11), 1025–1040 (1985)
  11. Myers, E.W.: An O(ND) Difference Algorithm and Its Variations. Algorithmica 1, 251–266 (1986)
  12. Wu, S., Manber, U., Myers, G., Miller, W.: An O(NP) Sequence Comparison Algorithm. Information Processing Letters 35, 317–323 (1990)
  13. Nakatsu, N., Kambayashi, Y., Yajima, S.: A Longest Common Subsequence Algorithm Suitable for Similar Text Strings. Acta Informatica 18, 171–179 (1982)
  14. Chin, F., Poon, C.K.: Performance Analysis of Some Simple Heuristics for Longest Common Subsequences. Algorithmica 12, 293–311 (1994)
  15. Bergroth, L., Hakonen, H., Raita, T.: New Approximation Algorithms for Longest Common Subsequences. In: Proceedings of SPIRE 1998, Santa Cruz de la Sierra, Bolivia (September 1998)
  16. Johtela, T., Smed, J., Hakonen, H., Raita, T.: An Efficient Heuristic for the LCS Problem. In: Third South American Workshop on String Processing, WSP 1996, Recife, Brazil, August 1996, pp. 126–140 (1996)
  17. Kuo, S., Cross, G.R.: An Improved Algorithm to Find the Length of the Longest Common Subsequence of Two Strings. ACM SIGIR Forum 23(3-4), 89–99 (1989)
  18. Rick, C.: Simple and Fast Linear Space Computation of Longest Common Subsequences. Information Processing Letters 75(6), 275–281 (2000)
  19. Goeman, H., Clausen, M.: A New Practical Linear Space Algorithm for the Longest Common Subsequence Problem. In: Proceedings of the Prague Stringology Club Workshop 1999 (1999)

Copyright information

© Springer-Verlag Berlin Heidelberg 2003

Authors and Affiliations

  • Lasse Bergroth (1, 3)
  • Harri Hakonen (2, 3)
  • Juri Väisänen (2)

  1. Department of Information Technology / Programming techniques, Turku University, Salo, Finland
  2. Department of Information Technology, Turku University, Turku, Finland
  3. TUCS – Turku Centre for Computer Science
