Abstract
Given a set S = {S_1, S_2, …, S_l} of l strings, a text T, and a natural number k, find a string M that is a concatenation of k strings from S (not necessarily distinct, i.e., a string in S may occur more than once in M) and whose longest common subsequence with T is as long as possible. Such a string is called a k-inlay. The resequencing longest common subsequence problem (resequencing LCS problem for short) is to find a k-inlay for each query with parameter k after T and S are given. In this paper, we propose an algorithm for solving this problem that takes O(nml) preprocessing time and O(ϑ_k k) query time for each query with parameter k, where n is the length of T, m is the maximal length of strings in S, and ϑ_k is the length of the longest common subsequence between a k-inlay and T.
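To make the definition of a k-inlay concrete, the following is a minimal brute-force sketch in Python. It is not the algorithm proposed in the paper: it simply enumerates all l^k concatenations and scores each one with a textbook LCS dynamic program, so it is only practical for tiny inputs. The function names `lcs_length` and `brute_force_k_inlay` are hypothetical and chosen for illustration.

```python
from itertools import product


def lcs_length(a: str, b: str) -> int:
    """Length of a longest common subsequence of a and b (textbook DP)."""
    prev = [0] * (len(b) + 1)
    for ch in a:
        cur = [0]
        for j, bj in enumerate(b, start=1):
            if ch == bj:
                cur.append(prev[j - 1] + 1)
            else:
                cur.append(max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def brute_force_k_inlay(strings, text, k):
    """Return (M, score) where M is a k-inlay found by exhaustive search.

    Illustrative only: tries every ordered choice of k strings from
    `strings` (repetition allowed) and keeps the first concatenation
    whose LCS with `text` is largest.
    """
    best_score, best_inlay = -1, ""
    for choice in product(strings, repeat=k):
        candidate = "".join(choice)
        score = lcs_length(candidate, text)
        if score > best_score:
            best_score, best_inlay = score, candidate
    return best_inlay, best_score


if __name__ == "__main__":
    S = ["ab", "c"]
    T = "abcab"
    print(brute_force_k_inlay(S, T, 2))  # ('abab', 4)
    print(brute_force_k_inlay(S, T, 3))  # ('abcab', 5)
```

In this toy instance, the second value returned for each k plays the role of ϑ_k in the abstract: 4 for k = 2 and 5 for k = 3.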
Acknowledgements
The authors would like to thank the anonymous referees for their careful reading, corrections, and useful comments, which helped to improve the paper.
Additional information
This work was supported in part by the National Science Council of the Republic of China under contracts NSC 100-2221-E-011-067-MY3 and NSC 101-2221-E-011-038-MY3.
Cite this article
Kuo, CE., Wang, YL., Liu, JJ. et al. Resequencing a Set of Strings Based on a Target String. Algorithmica 72, 430–449 (2015). https://doi.org/10.1007/s00453-013-9859-z
DOI: https://doi.org/10.1007/s00453-013-9859-z