An Algorithm to Find All Identical Motifs in Multiple Biological Sequences

  • Ashish Kishor Bindal
  • R. Sabarinathan
  • J. Sridhar
  • D. Sherlin
  • K. Sekar
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6282)


Sequence motifs are of great biological importance in nucleotide and protein sequences. The conserved occurrence of identical motifs indicates functional significance and helps to classify biological sequences. In this paper, a new algorithm is proposed to find all identical motifs in multiple nucleotide or protein sequences. The proposed algorithm is based on dynamic programming. Its applications include the identification of (a) conserved identical sequence motifs and (b) identical or direct repeat sequence motifs across multiple biological sequences (nucleotide or protein sequences). Further, the algorithm facilitates the comparative analysis of internal sequence repeats for evolutionary studies, which helps to derive phylogenetic relationships from the distribution of repeats.
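The abstract does not spell out the algorithm, so the following is only a minimal sketch of how a dynamic programming table can locate identical motifs, not the authors' method. It assumes the simplest case of two sequences and uses the standard common-substring recurrence; the function name `identical_motifs` and the parameter `min_len` are hypothetical names introduced for illustration.

```python
def identical_motifs(seq_a, seq_b, min_len=3):
    """Return maximal identical substrings shared by seq_a and seq_b.

    Illustrative sketch only (not the published algorithm).
    Each hit is (motif, start_in_a, start_in_b) with 0-based starts.
    """
    hits = []
    # prev[j] holds the length of the identical run ending at
    # seq_a[i-2] and seq_b[j-1]; only two rows of the table are kept.
    prev = [0] * (len(seq_b) + 1)
    for i in range(1, len(seq_a) + 1):
        curr = [0] * (len(seq_b) + 1)
        for j in range(1, len(seq_b) + 1):
            if seq_a[i - 1] != seq_b[j - 1]:
                continue
            # Extend the run that ended on the previous diagonal cell.
            curr[j] = prev[j - 1] + 1
            # The run is maximal if it cannot be extended to the right.
            at_end = i == len(seq_a) or j == len(seq_b)
            if at_end or seq_a[i] != seq_b[j]:
                length = curr[j]
                if length >= min_len:
                    hits.append((seq_a[i - length:i], i - length, j - length))
        prev = curr
    return hits


if __name__ == "__main__":
    print(identical_motifs("GATTACA", "TTACAGG", min_len=3))
    # [('TTACA', 2, 0)]
```

This sketch takes time proportional to the product of the two sequence lengths. Handling more than two sequences, or direct repeats within a single sequence, would need further steps that are not shown here.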


Sequence motifs; nucleotide and protein sequences; identical motifs; dynamic programming; direct repeat; phylogenetic relationships



Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Ashish Kishor Bindal (1)
  • R. Sabarinathan (1)
  • J. Sridhar (2)
  • D. Sherlin (1)
  • K. Sekar (1)

  1. Bioinformatics Centre (Centre of excellence in Structural Biology and Bio-computing), Indian Institute of Science, Bangalore, India
  2. Center of Excellence in Bioinformatics, School of Biotechnology, Madurai Kamaraj University, Madurai, India
