Abstract
In this work we examine the problem of finding biological motifs in DNA databases. We address it with MBMEDA, an evolutionary method based on the Estimation of Distribution Algorithm (EDA). Although MBMEDA assumes statistical independence between the main variables of the problem, its results were satisfactory compared with those obtained by other methods, and in some cases better. Performance was measured with two metrics taken from the field of information retrieval: precision and recall. The comparison involved searching for a motif in two types of DNA datasets: synthetic and real. On a set of five real databases, the average values of precision and recall were 0.866 and 0.798, respectively.
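As an illustration of the two metrics named above, the following is a minimal sketch of how precision and recall could be computed for predicted motif sites. It assumes that a site is identified by a (sequence id, start position) pair and that a prediction counts as correct only on an exact match; the abstract does not state the matching criterion used for MBMEDA, so the function name `precision_recall` and the sample data are hypothetical.

```python
def precision_recall(predicted, actual):
    """Return (precision, recall) for predicted vs. true motif sites.

    Each site is a (sequence_id, start) pair. Exact matching is an
    assumption made for this sketch, not the paper's stated criterion.
    """
    predicted, actual = set(predicted), set(actual)
    true_positives = len(predicted & actual)
    # Precision: fraction of predicted sites that are true sites.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: fraction of true sites that were predicted.
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall


# Hypothetical data: four true sites, three predictions, two of them correct.
actual = [("seq1", 12), ("seq2", 40), ("seq3", 7), ("seq4", 55)]
predicted = [("seq1", 12), ("seq2", 40), ("seq3", 9)]
p, r = precision_recall(predicted, actual)
print(f"precision={p:.3f} recall={r:.3f}")
```

With this data, two of the three predictions are correct and two of the four true sites are recovered, so precision is 2/3 and recall is 1/2.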
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Jordán, C.I., Jordán, C.J. (2015). MBMEDA: An Application of Estimation of Distribution Algorithms to the Problem of Finding Biological Motifs. In: Ferrández Vicente, J., Álvarez-Sánchez, J., de la Paz López, F., Toledo-Moreo, F., Adeli, H. (eds) Artificial Computation in Biology and Medicine. IWINAC 2015. Lecture Notes in Computer Science, vol 9107. Springer, Cham. https://doi.org/10.1007/978-3-319-18914-7_5
DOI: https://doi.org/10.1007/978-3-319-18914-7_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-18913-0
Online ISBN: 978-3-319-18914-7
eBook Packages: Computer Science, Computer Science (R0)