Abstract
The longest common subsequence (LCS) problem is a classic, well-studied problem in computer science, with applications ranging from spelling error correction to molecular biology. This paper focuses on LCS for sequences with fixed alphabet size and fixed run-length, where the run-length of a sequence is the maximum number of consecutive occurrences of the same symbol. We show that LCS is NP-complete even when restricted to (i) alphabets of size 3 and run-length at most 1, and (ii) alphabets of size 2 and run-length at most 2; both results are tight. For the latter case, we show that the problem is approximable within ratio 3/5.
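The two definitions used in the abstract, run-length and common subsequence, can be made concrete with a short sketch. The Python below is only an illustration under our own assumptions: the function names `max_run_length`, `is_subsequence`, and `is_common_subsequence` are hypothetical and do not come from the paper, and the code checks the definitions rather than solving LCS.

```python
from itertools import groupby


def max_run_length(s):
    """Return the largest number of consecutive equal symbols in s (0 if empty)."""
    return max((sum(1 for _ in group) for _, group in groupby(s)), default=0)


def is_subsequence(candidate, s):
    """True if candidate can be obtained from s by deleting symbols."""
    remaining = iter(s)
    # `ch in remaining` advances the iterator past the first match,
    # so symbols of candidate are matched left to right in order.
    return all(ch in remaining for ch in candidate)


def is_common_subsequence(candidate, strings):
    """True if candidate is a subsequence of every string in strings."""
    return all(is_subsequence(candidate, s) for s in strings)


# Instances shaped like the two restricted cases in the abstract
# (the strings themselves are made up for illustration):
case_i = ["abcacb", "acbabc"]    # alphabet size 3, run-length 1
case_ii = ["aabbab", "abbaab"]   # alphabet size 2, run-length at most 2
```

For example, `max_run_length("aabbab")` is 2, and `is_common_subsequence("abab", case_ii)` holds because `abab` can be read left to right inside both strings.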
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Blin, G., Bulteau, L., Jiang, M., Tejada, P.J., Vialette, S. (2012). Hardness of Longest Common Subsequence for Sequences with Bounded Run-Lengths. In: Kärkkäinen, J., Stoye, J. (eds) Combinatorial Pattern Matching. CPM 2012. Lecture Notes in Computer Science, vol 7354. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-31265-6_11
DOI: https://doi.org/10.1007/978-3-642-31265-6_11
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-31264-9
Online ISBN: 978-3-642-31265-6
eBook Packages: Computer Science (R0)