Algorithmica

pp 1–32

Consensus Strings with Small Maximum Distance and Small Distance Sum

  • Laurent Bulteau
  • Markus L. Schmid

Abstract

The parameterised complexity of various consensus string problems (Closest String, Closest Substring, Closest String with Outliers) is investigated in a more general setting, i.e., with a bound on the maximum Hamming distance and a bound on the sum of Hamming distances between solution and input strings. We completely settle the parameterised complexity of these generalised variants of Closest String and Closest Substring, and partly settle it for Closest String with Outliers; in addition, we answer some open questions from the literature regarding the classical problem variants with only one distance bound. Finally, we investigate the question of polynomial kernels and respective lower bounds.
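
To make the generalised setting concrete, the following is a minimal sketch of the two-bound condition for Closest String: a candidate is feasible if its maximum Hamming distance to the input strings and its sum of Hamming distances both stay within the given bounds. The function names, the parameter names `max_dist` and `dist_sum`, and the exhaustive search are illustrative assumptions and are not taken from the article; the search is exponential in the string length and is only meant for tiny instances.

```python
from itertools import product
from typing import Optional, Sequence


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("strings must have equal length")
    return sum(x != y for x, y in zip(a, b))


def satisfies_bounds(candidate: str, strings: Sequence[str],
                     max_dist: int, dist_sum: int) -> bool:
    """Check both bounds at once (assumes `strings` is non-empty)."""
    distances = [hamming(candidate, s) for s in strings]
    return max(distances) <= max_dist and sum(distances) <= dist_sum


def brute_force_consensus(strings: Sequence[str], alphabet: str,
                          max_dist: int, dist_sum: int) -> Optional[str]:
    """Hypothetical helper: try every string over `alphabet` of the input
    length and return the first one meeting both bounds, or None."""
    length = len(strings[0])
    for letters in product(alphabet, repeat=length):
        candidate = "".join(letters)
        if satisfies_bounds(candidate, strings, max_dist, dist_sum):
            return candidate
    return None


if __name__ == "__main__":
    inputs = ["aab", "abb", "bbb"]
    print(brute_force_consensus(inputs, "ab", max_dist=1, dist_sum=2))
```

Setting `dist_sum` to a trivially large value recovers the classical variant with only the maximum-distance bound, and likewise for `max_dist`.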

Keywords

Consensus String Problems, Closest String, Closest Substring, Parameterised Complexity, Kernelisation

Notes

Acknowledgements

We wish to thank the anonymous referees for valuable feedback that improved the readability of this paper.

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2019

Authors and Affiliations

  1. Université Paris-Est, LIGM (UMR 8049), CNRS, ENPC, ESIEE Paris, UPEM, Marne-la-Vallée, France
  2. Fachbereich 4 – Abteilung Informatikwissenschaften, Universität Trier, Trier, Germany
