Closest Substring Problem – Results from an Evolutionary Algorithm

  • Holger Mauch
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3316)


The closest substring problem formalizes the task of finding a pattern such that each string in a given set contains a subregion that is highly similar to that pattern. This problem appears frequently in computational biology and in coding theory. Experimental results suggest that this NP-hard optimization problem can be approached very well with a custom-built evolutionary algorithm that uses a fixed-length string representation, as in the typical genetic algorithm (GA) concept. Part of this success can be attributed to a novel mutation operator introduced in this paper. For practical purposes, the GA used here appears to improve on traditional approximation algorithms: although the time complexity of those algorithms can be analyzed precisely, they suffer from poor run-time efficiency, poor accuracy, or both.
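To make the problem statement concrete, the following minimal sketch evaluates one candidate pattern, assuming that similarity is measured by Hamming distance and that the quantity to minimize is the largest distance from the pattern to its best-matching window in any input string. The function names (`hamming`, `best_window_distance`, `radius`) are hypothetical and chosen for illustration only; this is not the paper's GA or its mutation operator, just the kind of objective such a search could evaluate.

```python
def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("strings must have equal length")
    return sum(x != y for x, y in zip(a, b))


def best_window_distance(pattern: str, s: str) -> int:
    """Smallest Hamming distance between the pattern and any window of s
    that has the same length as the pattern."""
    m = len(pattern)
    if len(s) < m:
        raise ValueError("input string is shorter than the pattern")
    return min(hamming(pattern, s[i:i + m]) for i in range(len(s) - m + 1))


def radius(pattern: str, strings: list[str]) -> int:
    """Worst case over all input strings of the best window distance.
    A smaller value means the pattern is close to a subregion of every string."""
    return max(best_window_distance(pattern, s) for s in strings)


# Illustrative input (made up for this sketch, not data from the paper).
strings = ["ACGTAC", "TTACGA", "GACGTT"]
print(radius("ACG", strings))  # "ACG" occurs exactly in every string
print(radius("AAA", strings))
```

Under these assumptions, a fixed-length candidate string can be scored directly with `radius`, which is what makes a fixed-length representation convenient for a search procedure.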


Keywords: Genetic Algorithm, Closest String Problem, Closest Substring Problem, Radius of Code





Copyright information

© Springer-Verlag Berlin Heidelberg 2004

Authors and Affiliations

  • Holger Mauch
    Dept. of Information and Computer Science, University of Hawaii at Manoa, Honolulu
