Approximate Pattern Matching for DNA Sequence Data

  • Nagamma Patil
  • Durga Toshniwal
  • Kumkum Garg
Part of the Communications in Computer and Information Science book series (CCIS, volume 142)


In real-world biological applications, most relevant sequences are “similar” rather than exactly the same. It is therefore useful to search for patterns using soft computing techniques. Our proposed work uses a fuzzy approach for approximate pattern matching. Given a database of sequences, our aim is to find the total number of candidate patterns that approximately match the query pattern within a specified fault tolerance. We present a complete analysis of the matching patterns found by the fuzzy and exact matching approaches. We use DNA sequences of bacteria downloaded from the National Center for Biotechnology Information (NCBI).
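The task stated in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's fuzzy membership function, so this example assumes a simple stand-in: the degree of match between a window and the query is the fraction of positions at which they agree, and a window counts as a candidate when that degree is at least 1 minus the tolerance. The names `similarity`, `count_matches`, and `tolerance` are hypothetical and are not taken from the paper.

```python
def similarity(window: str, query: str) -> float:
    """Degree of match in [0, 1]: fraction of positions where the two strings agree.

    Assumption: a stand-in for a fuzzy membership value, not the paper's definition.
    """
    matches = sum(1 for a, b in zip(window, query) if a == b)
    return matches / len(query)


def count_matches(sequences, query, tolerance=0.0):
    """Count query-length windows whose similarity is at least 1 - tolerance.

    tolerance = 0.0 reduces to exact matching.
    """
    m = len(query)
    if m == 0:
        raise ValueError("query must be non-empty")
    threshold = 1.0 - tolerance
    count = 0
    for seq in sequences:
        # Slide a window of the query's length over each sequence.
        for i in range(len(seq) - m + 1):
            if similarity(seq[i:i + m], query) >= threshold:
                count += 1
    return count


# Toy data for illustration only (not sequences from NCBI).
database = ["ACGTACGA", "TTACGTT"]
exact_count = count_matches(database, "ACGT")          # exact matching
fuzzy_count = count_matches(database, "ACGT", 0.25)    # allow 1 mismatch in 4
```

With the toy data, exact matching finds the two literal occurrences of `ACGT`, while a tolerance of 0.25 also admits the window `ACGA`, which differs from the query in one position.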


Soft computing, fuzzy sets, approximate matching, exact matching, DNA sequence





Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Nagamma Patil (1)
  • Durga Toshniwal (1)
  • Kumkum Garg (1)
  1. Department of Electronics and Computer Engineering, Indian Institute of Technology Roorkee, Roorkee, India
