Abstract
Comparing the amino acids of binding sites across different proteins can facilitate protein function identification. However, a binding site usually consists of several crucial amino acids that are frequently dispersed among different regions of a protein, which makes binding sites difficult to compare. In this study, we introduce a new method, named chemical and geometric similarity of binding site (CGS-BSite), that computes ligand binding site similarity from discrete amino acids using a maximum-weight bipartite matching algorithm. The underlying principle is to find a Euclidean transformation under which similar amino acids lie close to each other in geometric space and, conversely, amino acids that lie close to each other are similar. CGS-BSite tolerates site and ligand flexibility and provides stable prediction performance on flexible ligand binding sites. In binding site prediction on three test datasets, CGS-BSite performs similarly to the Patch-Surfer method and outperforms the five other tested methods, reaching areas under the receiver operating characteristic curve of 0.80, 0.71 and 0.85, respectively. It performs marginally better than Patch-Surfer on binding sites with small volume and higher hydrophobicity, and it is robust to variation in the volume and hydrophobicity of ligand binding sites. Overall, our method provides an alternative approach for computing ligand binding site similarity and for predicting potential binding sites for specific ligands from existing ligand targets based on target similarity.
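The matching idea described in the abstract can be illustrated with a minimal Python sketch. It is not the CGS-BSite scoring scheme: the residue similarity table, the Gaussian distance damping, the `sigma` parameter and all function names are assumptions made for illustration, and the two sites are assumed to be already superposed in a common coordinate frame. The exhaustive search is only practical for a handful of residues; a polynomial-time assignment algorithm would be the usual choice for larger sites.

```python
from itertools import permutations
import math


def residue_similarity(name_a, name_b):
    # Hypothetical toy chemistry score: 1 for identical residue types, else 0.
    # A substitution matrix could be plugged in here instead.
    return 1.0 if name_a == name_b else 0.0


def pair_weight(res_a, res_b, sigma=2.0):
    # Chemical similarity damped by the distance between residue positions.
    # sigma (in angstroms) is an assumed tolerance, not a published value.
    (name_a, xyz_a), (name_b, xyz_b) = res_a, res_b
    d = math.dist(xyz_a, xyz_b)
    return residue_similarity(name_a, name_b) * math.exp(-(d / sigma) ** 2)


def max_weight_matching(site_a, site_b):
    """Exhaustive maximum-weight bipartite matching between two residue lists.

    Returns (total weight, list of (index in site_a, index in site_b)).
    """
    if len(site_a) > len(site_b):
        # Always enumerate assignments of the smaller site into the larger one.
        score, pairs = max_weight_matching(site_b, site_a)
        return score, [(j, i) for i, j in pairs]
    best_score, best_pairs = float("-inf"), []
    for perm in permutations(range(len(site_b)), len(site_a)):
        score = sum(pair_weight(site_a[i], site_b[j]) for i, j in enumerate(perm))
        if score > best_score:
            best_score, best_pairs = score, list(enumerate(perm))
    return best_score, best_pairs


# Two made-up sites: (residue name, (x, y, z)) after superposition.
site_a = [("HIS", (0.0, 0.0, 0.0)), ("ASP", (3.0, 0.0, 0.0)), ("SER", (0.0, 3.0, 0.0))]
site_b = [("SER", (0.2, 3.1, 0.0)), ("HIS", (0.1, 0.0, 0.0)),
          ("ASP", (3.0, 0.2, 0.0)), ("GLY", (8.0, 8.0, 8.0))]

score, pairs = max_weight_matching(site_a, site_b)
print(pairs, round(score, 3))
```

In this toy input, each residue of `site_a` pairs with the residue of the same type in `site_b`, and the unmatched glycine is left out. In a full method, the score would be re-evaluated over candidate Euclidean transformations rather than for a single fixed superposition.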
Abbreviations
- CGS-BSite: Chemical and geometric similarity of binding site
- ROC: Receiver operating characteristic
- PDB: Protein data bank
- BSSF: Binding site similarity and function
- APF: Atomic property fields
- 2D: 2 Dimensions
- TIPSA: Triangulation-based Iterative-closest-point for protein surface alignment
- ACT: Acetate ion
- BLOSUM: BLOcks substitution matrix
- CC: Correlation coefficient
- ECDF: Empirical cumulative distribution function
- AA: Amino acid
- AUC: Area under curve
- 3D: 3 Dimensions
- BLAST: Basic local alignment search tool
- PLM: Palmitic acid
- GPDH: Glycerol-3-phosphate dehydrogenase
- AMP: Adenosine monophosphate
- ATP: Adenosine-5′-triphosphate
- NAD: Nicotinamide-adenine-dinucleotide
- AND: Adenosine
- GAL: Beta-D-galactose
- RTL: Retinol
- FAD: Flavin-adenine dinucleotide
Acknowledgments
The authors thank the anonymous reviewers for their criticism and valuable suggestions. This work was supported by the National 973 Key Basic Research Program (Grant Nos. 2012CB017062 and 2010CB945401), the National Natural Science Foundation of China (Grant Nos. 31240038, 31071162 and 31000590) and the Science and Technology Commission of Shanghai Municipality (Grant No. 11DZ2260300).
Conflict of interest
None.
Cite this article
Tu, H., Shi, T. Ligand Binding Site Similarity Identification Based on Chemical and Geometric Similarity. Protein J 32, 373–385 (2013). https://doi.org/10.1007/s10930-013-9494-1
DOI: https://doi.org/10.1007/s10930-013-9494-1