Finding Conserved Regions in Protein Structures Using Support Vector Machines and Structure Alignment

  • Tatsuya Akutsu
  • Morihiro Hayashida
  • Takeyuki Tamura
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7632)


This paper proposes a novel method for finding conserved regions in three-dimensional protein structures. The method combines support vector machines (SVMs), feature selection, and protein structure alignment. To this end, a new feature vector is developed based on structure alignment of fragments of protein backbone structures. Preliminary computational experiments suggest that the proposed method is useful for finding common structural fragments in similar proteins.
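
The abstract does not give the details of the feature selection step. As a rough illustration only, the sketch below shows one common way to combine a linear SVM with feature selection, namely recursive feature elimination: train, drop the feature with the smallest squared weight, and repeat. Everything in it is an assumption made for illustration: the function names `train_linear_svm` and `svm_rfe`, the plain subgradient trainer standing in for a real SVM solver, and the synthetic data, where each column plays the role of a hypothetical fragment-alignment score. It is not the authors' implementation.

```python
# Toy sketch of SVM-based recursive feature elimination (SVM-RFE).
# Assumption: each column stands in for a hypothetical fragment-alignment
# score; the data below is synthetic and not taken from the paper.

def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Fit w, b by full-batch subgradient descent on the
    L2-regularised hinge loss (a simple stand-in for an SVM solver)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]   # gradient of (lam/2) * ||w||^2
        grad_b = 0.0
        for xi, yi in zip(X, y):
            score = sum(wj * xj for wj, xj in zip(w, xi)) + b
            if yi * score < 1.0:          # sample violates the margin
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def svm_rfe(X, y, n_keep):
    """Repeatedly retrain and drop the feature with the smallest w_j^2.
    Returns (kept feature indices, eliminated indices in removal order)."""
    active = list(range(len(X[0])))
    eliminated = []
    while len(active) > n_keep:
        X_active = [[row[j] for j in active] for row in X]
        w, _ = train_linear_svm(X_active, y)
        worst = min(range(len(active)), key=lambda k: w[k] ** 2)
        eliminated.append(active.pop(worst))
    return active, eliminated


# Columns 0 and 2 separate the classes; columns 1 and 3 are balanced
# within each class by construction, so they carry no class signal.
X = [
    [1.0,  0.5, 0.8,  0.5],
    [1.0, -0.5, 0.8,  0.5],
    [1.0,  0.5, 0.8, -0.5],
    [1.0, -0.5, 0.8, -0.5],
    [0.0,  0.5, 0.2,  0.5],
    [0.0, -0.5, 0.2,  0.5],
    [0.0,  0.5, 0.2, -0.5],
    [0.0, -0.5, 0.2, -0.5],
]
y = [1, 1, 1, 1, -1, -1, -1, -1]

kept, eliminated = svm_rfe(X, y, n_keep=2)
print("kept:", kept, "eliminated:", eliminated)
```

On this toy data the two uninformative columns are eliminated and the two class-separating columns are kept. In a real setting the hand-written trainer would normally be replaced by an established SVM solver, and the columns would be computed from actual structure alignments.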


Keywords (machine-generated, not supplied by the authors): Support Vector Machine; Feature Vector; Feature Selection Method; Structure Alignment; Recursive Feature Elimination



Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Tatsuya Akutsu (1)
  • Morihiro Hayashida (1)
  • Takeyuki Tamura (1)
  1. Bioinformatics Center, Institute for Chemical Research, Kyoto University, Uji, Japan
