Merge Based Genetic Algorithm for Motif Discovery

Chapter in: Soft Computing for Data Mining Applications

Part of the book series: Studies in Computational Intelligence (SCI, volume 190)

Abstract

Motif discovery is an important problem in bioinformatics that involves searching for approximate matches. Various algorithms have been proposed, including exhaustive searches and heuristic searches that examine only a subset of the possible solutions. One frequently employed heuristic is the genetic algorithm. MDGA takes a genetic-algorithm-based approach with a single population. We build on that method by using multiple populations, each evolving against a different fitness landscape. Individuals in each population compete, based on probabilities, for participation in the genetic events of crossover and mutation. Each fitness landscape is designed to solve a subset of the problem, thus optimizing a particular characteristic. Once evolution in each of these populations has saturated, the populations are merged according to a global fitness scheme and then evolved further. This process continues until the final population also converges. Different implementation options are also discussed. We then compare our results with those of standard methods on well-known datasets, and the results obtained are good.
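The merge-based scheme outlined in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the chapter's implementation: the encoding (one motif start position per sequence), the consensus-count fitness, the choice of giving each sub-population a slice of the sequences as its "landscape", the stall-based saturation test, and all names (`consensus_score`, `evolve`, `merge_ga`) are hypothetical choices made only for this example.

```python
import random
from collections import Counter

def consensus_score(seqs, starts, width, idxs):
    """Sum, over motif columns, of the count of the most common base,
    looking only at the sequences whose indices are in idxs."""
    score = 0
    for col in range(width):
        column = Counter(seqs[i][starts[i] + col] for i in idxs)
        score += column.most_common(1)[0][1]
    return score

def evolve(pop, fitness, seqs, width, rng, stall_limit=15, p_mut=0.2):
    """Evolve pop until the best fitness has not improved for
    stall_limit generations (assumed stand-in for 'saturation')."""
    best, stall = max(fitness(ind) for ind in pop), 0
    while stall < stall_limit:
        # Fitness-proportionate selection: parents compete by probability.
        weights = [fitness(ind) for ind in pop]
        children = []
        for _ in range(len(pop)):
            a, b = rng.choices(pop, weights=weights, k=2)
            # Uniform crossover: each start position comes from either parent.
            child = [rng.choice(pair) for pair in zip(a, b)]
            if rng.random() < p_mut:
                # Mutation: re-draw the start position in one sequence.
                i = rng.randrange(len(seqs))
                child[i] = rng.randrange(len(seqs[i]) - width + 1)
            children.append(tuple(child))
        # Elitist replacement: keep the fittest of parents and children.
        pop = sorted(pop + children, key=fitness, reverse=True)[:len(pop)]
        top = fitness(pop[0])
        best, stall = (top, 0) if top > best else (best, stall + 1)
    return pop

def merge_ga(seqs, width, n_pops=2, pop_size=30, seed=0):
    rng = random.Random(seed)
    all_idx = list(range(len(seqs)))

    def random_ind():
        return tuple(rng.randrange(len(s) - width + 1) for s in seqs)

    def global_fit(ind):
        return consensus_score(seqs, ind, width, all_idx)

    pops = []
    for p in range(n_pops):
        # Assumed landscape: sub-population p is scored on a slice of the sequences.
        idxs = all_idx[p::n_pops]
        local_fit = lambda ind, idxs=idxs: consensus_score(seqs, ind, width, idxs)
        pop = [random_ind() for _ in range(pop_size)]
        pops.append(evolve(pop, local_fit, seqs, width, rng))

    # Merge: pool all individuals, rank by the global fitness, keep the best.
    pooled = [ind for pop in pops for ind in pop]
    merged = sorted(pooled, key=global_fit, reverse=True)[:pop_size]
    final = evolve(merged, global_fit, seqs, width, rng)
    best = max(final, key=global_fit)
    return best, global_fit(best)

# Toy input: four short sequences that each contain the planted pattern ACGTA.
seqs = ["TTGACGTACC", "GGACGTATTT", "CACGTAGGCA", "TTTTACGTAA"]
best, score = merge_ga(seqs, width=5)
print("start positions:", best, "global score:", score)
```

Swapping in a different local fitness per sub-population (for example, one that rewards information content and another that penalizes low-complexity motifs) would change only the `local_fit` definition; the merge step and the final evolution under the global fitness stay the same.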

References

  1. Paul, T.K., Iba, H.: Identification of Weak Motifs in Multiple Biological Sequences using Genetic Algorithm. In: Proceedings of GECCO 2006, Seattle, USA (2006)

  2. Fogel, G.B., Weekes, D.G., Varga, G., Dow, E.R., Harlow, H.B., Onyia, J.E., Su, C.: Discovery of Sequence Motifs Related to Coexpression of Genes using Evolutionary Computation. Nucleic Acids Research 32(13), 3826–3835 (2004)

  3. Che, D., Song, Y., Rasheed, K.: MDGA: Motif Discovery using a Genetic Algorithm. In: Proceedings of GECCO 2005, pp. 447–452 (2005)

  4. Bailey, T.L., Elkan, C.: Unsupervised Learning of Multiple Motifs in Biopolymers using Expectation Maximization. Machine Learning 21, 51–80 (1995)

  5. Hertz, G.Z., Stormo, G.D.: Identifying DNA and Protein Patterns with Statistically Significant Alignment Sets of Multiple Sequences. Bioinformatics 15, 563–577 (1999)

  6. Thijs, G., Marchal, K., Lescot, M., Rombauts, S., De Moor, B., Rouzé, P., Moreau, Y.: A Gibbs Sampling Method to Detect Over-represented Motifs in the Upstream Regions of Coexpressed Genes. Journal of Computational Biology 9, 447–464 (2002)

  7. Liu, X., Brutlag, D.L., Liu, J.S.: BioProspector: Discovering Conserved DNA Motifs in Upstream Regulatory Regions of Co-expressed Genes. In: Pacific Symposium on Biocomputing, vol. 6, pp. 127–138 (2001)

  8. Neuwald, A.F., Liu, J.S., Lawrence, C.E.: Gibbs Motif Sampling: Detection of Bacterial Outer Membrane Protein Repeats. Protein Science 4, 1618–1632 (1995)

  9. Roth, F.P., Hughes, J.D., Estep, P.W., Church, G.M.: Finding DNA Regulatory Motifs within Unaligned Noncoding Sequences Clustered by Whole-Genome mRNA Quantitation. Nature Biotechnology 16, 939–945 (1998)

  10. Srinivasa, K.G., Sridharan, K., Shenoy, P.D., Venugopal, K.R., Patnaik, L.M.: A Dynamic Migration Model for Self Adaptive Genetic Algorithms. In: Gallagher, M., Hogan, J.P., Maire, F. (eds.) IDEAL 2005. LNCS, vol. 3578, pp. 555–562. Springer, Heidelberg (2005)

  11. Srinivas, M., Patnaik, L.M.: Binomially Distributed Populations for Modelling GAs. In: Proceedings of the Fifth International Conference on Genetic Algorithms, pp. 138–143. Morgan Kaufmann Publishers, San Francisco (1993)

  12. Fraenkel lab downloads, http://jura.wi.mit.edu/fraenkel/download/release_v24/fsafiles/

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Venugopal, K.R., Srinivasa, K.G., Patnaik, L.M. (2009). Merge Based Genetic Algorithm for Motif Discovery. In: Soft Computing for Data Mining Applications. Studies in Computational Intelligence, vol 190. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-00193-2_18

  • DOI: https://doi.org/10.1007/978-3-642-00193-2_18

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-00192-5

  • Online ISBN: 978-3-642-00193-2

  • eBook Packages: Engineering, Engineering (R0)
