Abstract
Hot regions are a key factor in maintaining the stability and coordination of protein-protein interactions. In this paper, we combine evolutionary information with a support vector machine (SVM) to develop an improved method for predicting binding sites in a protein sequence. The prediction models were trained and tested on binding protein chains and evaluated using fold cross-validation. The performance of this SVM model was further improved. Based on the predicted hot spots, the DBSCAN method is used to predict hot regions in protein-protein interactions. The experimental results demonstrate that the proposed method improves the predictive accuracy for hot regions and is more reliable than the previous method.
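The clustering step described above, grouping predicted hot spots into hot regions with DBSCAN, can be illustrated with a minimal sketch. This is not the authors' implementation: the coordinates, the `eps` radius (in ångströms), and the `min_pts` threshold are all hypothetical values chosen for illustration, and the hot spots are assumed to have already been predicted by an upstream classifier such as the SVM.

```python
import math

NOISE = -1  # label for hot spots that belong to no hot region


def region_query(points, i, eps):
    """Indices of all points within eps of points[i] (including i itself)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Plain DBSCAN: return one cluster label per point, NOISE for outliers."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:  # j is a core point: keep expanding
                queue.extend(j_neighbors)
    return labels


# Hypothetical 3D coordinates of predicted hot spot residues (assumed data).
hot_spots = [
    (0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 3.0, 0.0), (3.0, 3.0, 0.0),
    (30.0, 30.0, 30.0), (33.0, 30.0, 30.0), (30.0, 33.0, 30.0),
    (100.0, 100.0, 100.0),
]

# eps and min_pts are illustrative assumptions, not values from the paper.
labels = dbscan(hot_spots, eps=5.0, min_pts=3)

# Group residue indices by cluster label; each group is one candidate hot region.
hot_regions = {}
for idx, label in enumerate(labels):
    if label != NOISE:
        hot_regions.setdefault(label, []).append(idx)

print(labels)
print(hot_regions)
```

In this toy input, the two tightly packed groups of residues each form one candidate hot region, while the isolated residue is left as noise. Density-based clustering suits this task because the number of hot regions does not need to be fixed in advance.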
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Lin, X., Yang, H., Ye, J. (2015). Identification of Hot Regions in Protein-Protein Interactions Based on SVM and DBSCAN. In: Huang, DS., Jo, KH., Hussain, A. (eds) Intelligent Computing Theories and Methodologies. ICIC 2015. Lecture Notes in Computer Science, vol 9226. Springer, Cham. https://doi.org/10.1007/978-3-319-22186-1_38
DOI: https://doi.org/10.1007/978-3-319-22186-1_38
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-22185-4
Online ISBN: 978-3-319-22186-1
eBook Packages: Computer Science (R0)