Abstract
Motif discovery is an important problem in bioinformatics that involves searching for approximate matches. Various algorithms have been proposed, including exhaustive searches and heuristic searches that examine only a subset of the possible solutions. One frequently used heuristic is the genetic algorithm; MDGA, for example, applies a genetic algorithm to a single population. We build on that method by using multiple populations, each evolving against a different fitness landscape. Individuals in each population compete, on a probabilistic basis, for participation in the genetic events of crossover and mutation. Each fitness landscape is designed to solve a subset of the problem and thus optimizes a particular characteristic. Once evolution in each population has saturated, the populations are merged according to a global fitness scheme and evolved further. This process continues until the final population also converges. Different implementation options are also discussed. We then compare our results with those of standard methods on well-known datasets, and the results obtained are good.
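The merge-based scheme outlined in the abstract can be illustrated with a minimal Python sketch. It is not the chapter's implementation: the consensus-count fitness, the split of sequences into contiguous blocks as the "subsets of the problem", the stagnation test used for saturation, and all names and parameters (`consensus_score`, `evolve`, `merge_ga`, `patience`, `p_mut`) are assumptions made for illustration only.

```python
import random
from collections import Counter

def consensus_score(seqs, starts, width, rows):
    """Assumed fitness: per motif column, count of the most common base among `rows`."""
    score = 0
    for col in range(width):
        bases = Counter(seqs[r][starts[r] + col] for r in rows)
        score += bases.most_common(1)[0][1]
    return score

def evolve(pop, fitness, seqs, width, rng, patience=15, p_mut=0.2):
    """Evolve until the best fitness has not improved for `patience` generations."""
    best, stale = max(map(fitness, pop)), 0
    while stale < patience:
        # Fitness-proportional selection: fitter individuals are more
        # likely to take part in crossover and mutation.
        weights = [fitness(ind) for ind in pop]
        children = []
        while len(children) < len(pop):
            a, b = rng.choices(pop, weights=weights, k=2)
            cut = rng.randrange(1, len(a))          # single-point crossover
            child = list(a[:cut] + b[cut:])
            if rng.random() < p_mut:                # mutate one start position
                i = rng.randrange(len(child))
                child[i] = rng.randrange(len(seqs[i]) - width + 1)
            children.append(tuple(child))
        # Keep the fittest of parents and children (elitist replacement).
        pop = sorted(pop + children, key=fitness, reverse=True)[:len(pop)]
        top = fitness(pop[0])
        best, stale = (top, 0) if top > best else (best, stale + 1)
    return pop

def merge_ga(seqs, width, n_pops=2, pop_size=20, seed=0):
    """Evolve sub-populations on local landscapes, merge, then evolve globally."""
    rng = random.Random(seed)
    n = len(seqs)

    def random_individual():
        # An individual is one candidate motif start position per sequence.
        return tuple(rng.randrange(len(s) - width + 1) for s in seqs)

    subpops = []
    for k in range(n_pops):
        # Assumption: each landscape scores only one contiguous block of sequences.
        rows = list(range(k * n // n_pops, (k + 1) * n // n_pops))
        local = lambda ind, rows=rows: consensus_score(seqs, ind, width, rows)
        pop = [random_individual() for _ in range(pop_size)]
        subpops.append(evolve(pop, local, seqs, width, rng))

    all_rows = list(range(n))
    global_fit = lambda ind: consensus_score(seqs, ind, width, all_rows)
    # Merge: pool the saturated populations and keep the globally fittest.
    pooled = [ind for pop in subpops for ind in pop]
    merged = sorted(pooled, key=global_fit, reverse=True)[:pop_size]
    final = evolve(merged, global_fit, seqs, width, rng)
    return final[0], global_fit(final[0])
```

Calling `merge_ga(["TTACGTAA", "GGACGTCC", "CAACGTTG", "ATACGTGC"], width=4)` returns a tuple of start positions and its global score. The sketch assumes at least two sequences and `n_pops` no larger than the number of sequences, so that every local landscape scores at least one sequence.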
References
Paul, T.K., Iba, H.: Identification of Weak Motifs in Multiple Biological Sequences using Genetic Algorithm. In: Proceedings of GECCO 2006, Seattle, USA (2006)
Fogel, G.B., Weekes, D.G., Varga, G., Dow, E.R., Harlow, H.B., Onyia, J.E., Su, C.: Discovery of Sequence Motifs Related to Coexpression of Genes using Evolutionary Computation. Nucleic Acids Research 32(13), 3826–3835 (2004)
Che, D., Song, Y., Rasheed, K.: MDGA: Motif Discovery using a Genetic Algorithm. In: Proceedings of GECCO 2005, pp. 447–452 (2005)
Bailey, T.L., Elkan, C.: Unsupervised Learning of Multiple Motifs in Biopolymers using Expectation Maximization. Machine Learning 21, 51–80 (1995)
Hertz, G.Z., Stormo, G.D.: Identifying DNA and Protein Patterns with Statistically Significant Alignment Sets of Multiple Sequences. Bioinformatics 15, 563–577 (1999)
Thijs, G., Marchal, K., Lescot, M., Rombauts, S., De Moor, B., Rouze, P., Moreau, Y.: A Gibbs Sampling Method to Detect Over-represented Motifs in the Upstream Regions of Coexpressed Genes. Journal of Computational Biology 9, 447–464 (2002)
Liu, X., Brutlag, D.L., Liu, J.S.: BioProspector: Discovering Conserved DNA Motifs in Upstream Regulatory Regions of Co-expressed Genes. In: Pacific Symposium on Biocomputing, vol. 6, pp. 127–138 (2001)
Neuwald, A.F., Liu, J.S., Lawrence, C.E.: Gibbs Motif Sampling: Detection of Bacterial Outer Membrane Protein Repeats. Protein Science 4, 1618–1632 (1995)
Roth, F.P., Hughes, J.D., Estep, P.W., Church, G.M.: Finding DNA Regulatory Motifs within Unaligned Noncoding Sequences Clustered by Whole-Genome mRNA Quantitation. Nature Biotechnology 16, 939–945 (1998)
Srinivasa, K.G., Sridharan, K., Shenoy, P.D., Venugopal, K.R., Patnaik, L.M.: A Dynamic Migration Model for Self Adaptive Genetic Algorithms. In: Gallagher, M., Hogan, J.P., Maire, F. (eds.) IDEAL 2005. LNCS, vol. 3578, pp. 555–562. Springer, Heidelberg (2005)
Srinivas, M., Patnaik, L.M.: Binomially Distributed Populations for Modelling GAs. In: Proceedings of Fifth International Conference on Genetic Algorithms, pp. 138–143. Morgan Kaufmann Publishers, San Francisco (1993)
Fraenkel lab downloads, http://jura.wi.mit.edu/fraenkel/download/release_v24/fsafiles/
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Venugopal, K.R., Srinivasa, K.G., Patnaik, L.M. (2009). Merge Based Genetic Algorithm for Motif Discovery. In: Soft Computing for Data Mining Applications. Studies in Computational Intelligence, vol 190. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-00193-2_18
DOI: https://doi.org/10.1007/978-3-642-00193-2_18
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-00192-5
Online ISBN: 978-3-642-00193-2
eBook Packages: Engineering, Engineering (R0)