Abstract
CLOSEST STRING is one of the core problems in the field of consensus word analysis, with particular importance for computational biology. Given k strings of the same length and a positive integer d, find a “closest string” s such that none of the given strings has Hamming distance greater than d from s. CLOSEST STRING is NP-complete. We show how to solve CLOSEST STRING in linear time for constant d (the exponential growth is O(d^d)). We extend this result to the closely related problems d-MISMATCH and DISTINGUISHING STRING SELECTION. Moreover, we discuss fixed-parameter tractability for parameter k and give an efficient linear-time algorithm for CLOSEST STRING when k = 3. Finally, the practical usefulness of our findings is substantiated by some experimental results.
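The abstract states the problem but not the algorithm. As an illustration only, the following is a minimal sketch of one standard bounded search tree strategy for the problem as defined above, and it should not be read as the paper's exact procedure. The names `hamming` and `closest_string` are hypothetical. The search starts from the first input string and, whenever some input string is more than d away from the candidate, branches over d + 1 of the mismatching positions. It moves the candidate one step toward that string in each branch, with at most d such moves in total. The tree therefore has at most (d + 1)^d leaves, which is within a constant factor of d^d.

```python
from typing import Optional, Sequence


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def closest_string(strings: Sequence[str], d: int) -> Optional[str]:
    """Return some s with hamming(s, t) <= d for every t in strings, or None.

    Illustrative bounded search tree (an assumption, not taken from the text):
    recursion depth is at most d and each node has at most d + 1 children.
    """
    if not strings:
        raise ValueError("need at least one input string")
    length = len(strings[0])
    if any(len(t) != length for t in strings):
        raise ValueError("all input strings must have the same length")

    def search(candidate: str, budget: int) -> Optional[str]:
        # budget = number of further single-position changes still allowed
        if budget < 0:
            return None
        dists = [hamming(candidate, t) for t in strings]
        worst = max(dists)
        if worst > d + budget:
            # even spending the whole budget cannot bring this string within d
            return None
        if worst <= d:
            return candidate
        # pick any input string that is still too far from the candidate
        far = strings[next(j for j, x in enumerate(dists) if x > d)]
        mismatches = [p for p in range(length) if candidate[p] != far[p]]
        # any solution within budget agrees with `far` on one of any d + 1 mismatches
        for p in mismatches[: d + 1]:
            child = candidate[:p] + far[p] + candidate[p + 1:]
            found = search(child, budget - 1)
            if found is not None:
                return found
        return None

    return search(strings[0], d)
```

For example, `closest_string(["AAAA", "AABB"], 1)` returns a string within Hamming distance 1 of both inputs, while `closest_string(["AAAA", "BBBB"], 1)` returns `None` because no such string exists.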
Supported by the Deutsche Forschungsgemeinschaft (DFG), project OPAL (optimal solutions for hard problems in computational biology), NI 369/2-1.
Frances and Litman [4] show the NP-completeness of CLOSEST STRING, considering it from the viewpoint of coding theory, where it is known as the Minimum Radius problem.
References
[1] J. Alber, J. Gramm, and R. Niedermeier. Faster exact solutions for hard problems: a parameterized point of view. Discrete Mathematics, 229(1–3):3–27, 2001.
[2] R. G. Downey and M. R. Fellows. Parameterized Complexity. Springer, 1999.
[3] P. A. Evans and H. T. Wareham. Practical non-polynomial time algorithms for designing universal DNA oligonucleotides: a systematic approach. Manuscript, April 2001.
[4] M. Frances and A. Litman. On covering problems of codes. Theory of Computing Systems, 30:113–119, 1997.
[5] R. Kannan. Minkowski’s convex body theorem and integer programming. Mathematics of Operations Research, 12:415–440, 1987.
[6] J. K. Lanctot, M. Li, B. Ma, S. Wang, and L. Zhang. Distinguishing string selection problems. In Proc. of 10th ACM-SIAM SODA, pages 633–642, 1999. ACM Press. To appear in Information and Computation.
[7] J. C. Lagarias. Point lattices. In R. L. Graham et al. (eds.) Handbook of Combinatorics, pages 919–966. MIT Press, 1995.
[8] H. W. Lenstra. Integer programming with a fixed number of variables. Mathematics of Operations Research, 8:538–548, 1983.
[9] M. Li, B. Ma, and L. Wang. Finding similar regions in many strings. In Proc. of 31st ACM STOC, pages 473–482, 1999. ACM Press.
[10] P. A. Pevzner. Computational Molecular Biology: An Algorithmic Approach. MIT Press, 2000.
[11] N. Stojanovic, P. Berman, D. Gumucio, R. Hardison, and W. Miller. A linear-time algorithm for the 1-mismatch problem. In Proc. of 5th WADS, number 1272 in LNCS, pages 126–135, 1997. Springer.
[12] N. Stojanovic, L. Florea, C. Riemer, D. Gumucio, J. Slightom, M. Goodman, W. Miller, and R. Hardison. Comparison of five methods for finding conserved sequences in multiple alignments of gene regulatory regions. Nucleic Acids Research, 27(19):3899–3910, 1999.
Copyright information
© 2001 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Gramm, J., Niedermeier, R., Rossmanith, P. (2001). Exact Solutions for Closest String and Related Problems. In: Eades, P., Takaoka, T. (eds) Algorithms and Computation. ISAAC 2001. Lecture Notes in Computer Science, vol 2223. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45678-3_38
DOI: https://doi.org/10.1007/3-540-45678-3_38
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-42985-2
Online ISBN: 978-3-540-45678-0