Exploring Structurally Similar Protein Sequence Motifs

Soft Computing for Data Mining Applications

Part of the book series: Studies in Computational Intelligence (SCI, volume 190)

Abstract

Protein sequence motifs are short, conserved subsequences common to related protein sequences. Information about motifs is essential to the study of biologically significant conserved regions in protein families, because these regions can determine the function and conformation of proteins. Conventionally, recurring patterns in proteins are explored by taking short protein segments and classifying them according to similarity measures between the segments. Two protein sequences are assigned to the same class if they show high homology in the feature patterns extracted by sequence alignment algorithms. This methodology finds position-specific motifs only. In this chapter, we propose a new algorithm that explores protein sequences by studying subsequences with relative positioning of amino acids, followed by K-Means clustering of fixed-size segments. The dataset used in our work is the most up to date among studies of sequence motifs. Biochemical tests found in the literature are used to assess the significance of the motifs, and these tests show that the generated motifs are of both structural and functional interest. The results suggest that the method may also be applied to the closely related area of finding DNA motifs.
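The clustering step named in the abstract can be illustrated with a minimal sketch. It is not the chapter's algorithm: the abstract does not specify the segment encoding, the distance measure, or how relative positioning of amino acids is handled, so the one-hot encoding, the squared Euclidean distance, and the helper names `sliding_segments`, `one_hot`, and `k_means` are all assumptions made for illustration.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def sliding_segments(sequence, width):
    """Return every contiguous window of `width` residues."""
    return [sequence[i:i + width] for i in range(len(sequence) - width + 1)]


def one_hot(segment):
    """Encode a segment as a flat 0/1 vector, 20 slots per position.

    Assumed encoding: the abstract does not say how segments are represented.
    """
    return [1.0 if residue == aa else 0.0
            for residue in segment for aa in AMINO_ACIDS]


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(vectors, k, max_iterations=50, seed=0):
    """Plain K-Means: assign each vector to its nearest centroid, then
    recompute centroids, until the assignment stops changing."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, k)]
    labels = None
    for _ in range(max_iterations):
        new_labels = [
            min(range(k), key=lambda c: squared_distance(v, centroids[c]))
            for v in vectors
        ]
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
        for c in range(k):
            members = [v for v, label in zip(vectors, labels) if label == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(column) / len(members)
                                for column in zip(*members)]
    return labels, centroids


# Toy input: a made-up sequence, not data from the chapter.
segments = sliding_segments("ACDEFGACDEFG", 3)
labels, _ = k_means([one_hot(s) for s in segments], k=3)
for segment, label in zip(segments, labels):
    print(segment, label)
```

In this sketch, segments that are identical always fall into the same cluster, which is the sense in which a cluster stands for a candidate recurring pattern. A real study would replace the toy sequence and the one-hot vectors with whatever representation the chapter actually uses.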



Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Venugopal, K.R., Srinivasa, K.G., Patnaik, L.M. (2009). Exploring Structurally Similar Protein Sequence Motifs. In: Soft Computing for Data Mining Applications. Studies in Computational Intelligence, vol 190. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-00193-2_16

  • DOI: https://doi.org/10.1007/978-3-642-00193-2_16

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-00192-5

  • Online ISBN: 978-3-642-00193-2

  • eBook Packages: Engineering, Engineering (R0)
