Protein-Protein Binding Affinity Prediction Based on an SVR Ensemble
Accurate prediction of generic protein-protein binding affinities (PPBA) is essential for analyzing the output of protein docking and may help infer the real status of cellular protein-protein interaction sub-networks. However, accurate PPBA prediction remains extremely challenging, and machine learning methods are a promising way to address it. We propose a two-layer support vector regression (TLSVR) model that implicitly captures binding contributions that are hard to model explicitly. TLSVR circumvents both the descriptor compatibility problem and the need for problematic modeling assumptions. The input features for the first layer are the scores of 2209 interacting atom pairs within each distance bin. The second layer combines the base SVRs to infer the final affinities. Leave-one-out validation on our heterogeneous data shows that TLSVR achieves R=0.80 and SD=1.32 against the real affinities. A comparison experiment further demonstrates that TLSVR is superior to previous state-of-the-art methods in predicting generic PPBA.
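The two-layer arrangement described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: it uses scikit-learn, assumes one base SVR per distance bin, and runs on synthetic data with 3 bins of 6 features each in place of the 2209 atom-pair scores. The kernels, the inner 5-fold split, and the helper names (fit_two_layer, predict_two_layer) are hypothetical choices. Only R is computed, since the abstract does not define SD.

```python
import numpy as np
from sklearn.model_selection import KFold, LeaveOneOut, cross_val_predict
from sklearn.svm import SVR


def fit_two_layer(X_bins, y, inner_folds=5):
    """Fit one base SVR per distance bin, then a second-layer SVR on their outputs."""
    inner_cv = KFold(n_splits=inner_folds, shuffle=True, random_state=0)
    # Out-of-fold base predictions: the second layer is trained on outputs
    # for complexes the corresponding base model did not see during fitting.
    meta_X = np.column_stack(
        [cross_val_predict(SVR(kernel="rbf"), Xb, y, cv=inner_cv) for Xb in X_bins]
    )
    base_models = [SVR(kernel="rbf").fit(Xb, y) for Xb in X_bins]
    meta_model = SVR(kernel="linear").fit(meta_X, y)
    return base_models, meta_model


def predict_two_layer(base_models, meta_model, X_bins):
    """Run every base SVR on its own bin and combine the outputs in layer two."""
    meta_X = np.column_stack(
        [model.predict(Xb) for model, Xb in zip(base_models, X_bins)]
    )
    return meta_model.predict(meta_X)


# Synthetic stand-in data (assumption): 24 complexes, 3 distance bins,
# 6 features per bin instead of the 2209 atom-pair scores.
rng = np.random.default_rng(0)
n_complexes, n_bins, n_features = 24, 3, 6
X_bins = [rng.normal(size=(n_complexes, n_features)) for _ in range(n_bins)]
weights = [rng.normal(size=n_features) for _ in range(n_bins)]
y = sum(Xb @ w for Xb, w in zip(X_bins, weights))
y = y + rng.normal(scale=0.3, size=n_complexes)

# Leave-one-out validation: each complex is predicted by a model that was
# trained on all the other complexes.
preds = np.empty(n_complexes)
for train_idx, test_idx in LeaveOneOut().split(X_bins[0]):
    train_bins = [Xb[train_idx] for Xb in X_bins]
    test_bins = [Xb[test_idx] for Xb in X_bins]
    base_models, meta_model = fit_two_layer(train_bins, y[train_idx])
    preds[test_idx] = predict_two_layer(base_models, meta_model, test_bins)

# Pearson correlation between leave-one-out predictions and target affinities.
r = np.corrcoef(y, preds)[0, 1]
print(f"leave-one-out Pearson R on synthetic data: {r:.2f}")
```

Training the second layer on out-of-fold base predictions is a common stacking precaution; whether the original TLSVR does the same is not stated in the abstract.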
Keywords: Protein-protein interaction, affinity, machine learning, two-layer support vector machine, potential of mean force