A Consensus Approach for Identification of Protein-Protein Interaction Sites in Homo Sapiens

  • Brijesh K. Sriwastava
  • Subhadip Basu
  • Ujjwal Maulik
  • Dariusz Plewczynski
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8251)


The physico-chemical properties of interaction interfaces play a crucial role in the characterization of protein–protein interactions. Given the unbound structure of a protein and the fact that it forms a complex with another known protein, the objective of this work is to identify the residues involved in the interaction. We attempt to predict interaction sites in protein complexes using the local composition of amino acids together with their physico-chemical characteristics. Local sequence segments (LSSs) are extracted from the protein sequences using a sliding window of 21 amino acids. The list of LSSs is passed to a support vector machine (SVM) predictor, which identifies interacting residue pairs by considering their inter-atom distances. Three different SVM predictors are designed, producing results optimized for area under the ROC curve (AUC), Recall, and Precision, respectively. Finally, a 3-star consensus strategy is designed to analyze 33 hetero-complexes from Homo sapiens. The consensus approach yields an AUC score of 0.7376, which is superior to the individual SVM classification results.
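
The sliding-window step described in the abstract can be illustrated with a minimal Python sketch. Only the window length of 21 comes from the abstract. Centring each window on a residue, padding the termini with a placeholder symbol, and the function names are assumptions made for illustration, not the authors' implementation. The abstract also does not define the 3-star consensus rule, so the vote count below is only a stand-in that reports how many of the three predictors label a residue as interacting; any threshold applied to it would be a further assumption.

```python
from typing import List, Sequence

WINDOW = 21  # window length stated in the abstract
PAD = "X"    # assumption: placeholder for positions beyond the sequence termini


def local_sequence_segments(seq: str, window: int = WINDOW, pad: str = PAD) -> List[str]:
    """Return one window-length segment centred on each residue of seq.

    Centring and terminal padding are illustrative assumptions.
    """
    if window % 2 == 0:
        raise ValueError("window must be odd so that it has a central residue")
    half = window // 2
    padded = pad * half + seq + pad * half
    return [padded[i:i + window] for i in range(len(seq))]


def consensus_votes(predictions: Sequence[Sequence[int]]) -> List[int]:
    """Count, per residue, how many predictors output 1 (interacting).

    Hypothetical stand-in for a consensus step; not the paper's 3-star rule.
    """
    return [sum(votes) for votes in zip(*predictions)]
```

Each segment would then be encoded (for example, by amino-acid composition and physico-chemical features) before being passed to a classifier; that encoding is not specified in the abstract and is left out here.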


protein-protein interactions; machine learning; support vector machine; consensus approach



Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Brijesh K. Sriwastava (1)
  • Subhadip Basu (2)
  • Ujjwal Maulik (2)
  • Dariusz Plewczynski (3)
  1. Department of Computer Science and Engineering, Government College of Engineering and Leather Technology, Kolkata, India
  2. Department of Computer Science and Engineering, Jadavpur University, Kolkata, India
  3. Interdisciplinary Centre for Mathematical and Computational Modelling, University of Warsaw, Warsaw, Poland
