New Refinement Techniques for Longest Common Subsequence Algorithms
Certain properties of the input strings have a dominating influence on the running time of an algorithm selected to solve the longest common subsequence (lcs) problem for two input strings. It has turned out to be difficult, both theoretically and practically, to develop an lcs algorithm that is superior for all problem instances. Furthermore, implementing the most advanced lcs algorithms presented recently is laborious.
This paper shows that it is still beneficial to refine the traditional lcs algorithms to obtain new variants that are, in practice, competitive with the modern lcs methods on certain problem instances. We present and analyse a general-purpose algorithm, NKY-MODIF, which has moderate time and space efficiency and is easy to implement correctly. The algorithm is based on the so-called diagonal-wise method of Nakatsu, Kambayashi and Yajima (NKY). We selected the NKY algorithm for further consideration because it is algorithmically independent of the size of the input alphabet and has a light pre-processing phase.
The NKY-MODIF algorithm refines the NKY method in essentially three ways: by reducing unnecessary scanning over the input sequences, by storing the intermediate results more locally, and by utilizing lower and upper bound knowledge about the lcs. To demonstrate that some of the presented ideas are not specific to NKY, we apply lower bound information to two lcs algorithms whose processing approach differs from that of NKY. This introduces a new way to solve the lcs problem.
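To make the idea of lower and upper bound knowledge concrete, the following is a minimal sketch of two simple bounds on the lcs length. It is not taken from the paper, and the abstract does not say which bounds NKY-MODIF actually uses. The function names `lcs_lower_bound` and `lcs_upper_bound` are hypothetical, and both heuristics (a greedy left-to-right match and a per-symbol count comparison) are assumptions chosen only for illustration.

```python
from collections import Counter


def lcs_lower_bound(a: str, b: str) -> int:
    """Length of a greedily built common subsequence (illustrative heuristic).

    Every symbol of `a` is matched to its next unused occurrence in `b`,
    if one exists. The result is a valid common subsequence, so its
    length can never exceed the true lcs length.
    """
    pos = 0      # first position in b that is still available
    length = 0
    for symbol in a:
        j = b.find(symbol, pos)
        if j != -1:
            length += 1
            pos = j + 1
    return length


def lcs_upper_bound(a: str, b: str) -> int:
    """Sum over symbols of min(occurrences in a, occurrences in b).

    A common subsequence cannot use a symbol more often than it occurs
    in either string, so this sum can never be below the lcs length.
    """
    count_a, count_b = Counter(a), Counter(b)
    return sum(min(n, count_b[s]) for s, n in count_a.items())


low = lcs_lower_bound("ABCBDAB", "BDCABA")
high = lcs_upper_bound("ABCBDAB", "BDCABA")
print(low, high)
```

One plausible use of such bounds, again stated as an assumption, is to skip the exact computation when the two values coincide, or to narrow the range of lengths an exact algorithm has to examine.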
The lcs problem has two variants: calculating only the length of the lcs, and also determining the symbols belonging to one instance of the lcs. We verify the presented ideas for both of these problem types by extensive test runs.
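The difference between the two variants can be illustrated with the textbook dynamic programming method. The sketch below is not the NKY or NKY-MODIF algorithm; it only shows that the length-only variant can work with two table rows, whereas recovering one lcs instance in this simple form keeps the whole table for a backtrace. The function names `lcs_length` and `lcs_string` are hypothetical.

```python
def lcs_length(a: str, b: str) -> int:
    """Length-only variant: only the previous table row is kept."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def lcs_string(a: str, b: str) -> str:
    """Variant that also returns the symbols of one lcs instance."""
    m, n = len(a), len(b)
    # table[i][j] = lcs length of a[:i] and b[:j]
    table = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if a[i - 1] == b[j - 1]:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])

    # Walk back from the bottom-right corner, collecting matched symbols.
    symbols = []
    i, j = m, n
    while i > 0 and j > 0:
        if a[i - 1] == b[j - 1]:
            symbols.append(a[i - 1])
            i -= 1
            j -= 1
        elif table[i - 1][j] >= table[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return "".join(reversed(symbols))


print(lcs_length("ABCBDAB", "BDCABA"))
print(lcs_string("ABCBDAB", "BDCABA"))
```

An lcs is generally not unique, so `lcs_string` returns just one of possibly several subsequences of maximal length; which one depends on the tie-breaking rule in the backtrace.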
Keywords: longest common subsequence, string algorithms, heuristic algorithms