
The Protein Journal, Volume 32, Issue 5, pp 373–385

Ligand Binding Site Similarity Identification Based on Chemical and Geometric Similarity

  • Haibo Tu
  • Tieliu Shi
Article

Abstract

Comparing the amino acids that make up binding sites in different proteins can facilitate the identification of protein function. However, a binding site usually consists of several crucial amino acids that are frequently dispersed among different regions of a protein, which makes binding sites difficult to compare. In this study, we introduce a new method, chemical and geometric similarity of binding site (CGS-BSite), which computes ligand binding site similarity from discrete amino acids using a maximum-weight bipartite matching algorithm. The principle is to find a Euclidean transformation that brings similar amino acids close to each other in geometric space, and vice versa. CGS-BSite permits site and ligand flexibility and provides stable prediction performance on flexible ligand binding sites. On three test datasets, binding site prediction with CGS-BSite performs similarly to Patch-Surfer and outperforms the other five tested methods, reaching areas under the receiver operating characteristic curve of 0.80, 0.71 and 0.85, respectively. It performs marginally better than Patch-Surfer on binding sites with small volume and higher hydrophobicity, and it is robust to variation in the volume and hydrophobicity of ligand binding sites. Overall, our method provides an alternative approach for computing ligand binding site similarity and for predicting potential specific ligand binding sites from existing ligand targets based on target similarity.
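
To make the matching idea concrete, the following is a minimal sketch, not the CGS-BSite implementation. It assumes the two sites have already been superposed by some Euclidean transformation, and it uses a toy pair weight (a coarse chemical class score multiplied by a Gaussian of the distance). The class table, the `sigma` parameter and all function names are hypothetical. For brevity it solves the maximum-weight bipartite matching by exhaustive search, which is only practical for a handful of residues; a polynomial-time solver such as the Hungarian method would be the usual choice.

```python
from itertools import permutations
from math import dist, exp

# Hypothetical coarse chemical classes (an assumption for illustration only;
# the abstract does not specify the chemical scoring used by CGS-BSite).
CLASSES = {
    "hydrophobic": set("AVLIMFWC"),
    "polar": set("STNQYGPH"),
    "charged": set("DEKR"),
}


def chem_similarity(a, b):
    """Toy chemical score: 1.0 identical, 0.5 same class, 0.0 otherwise."""
    if a == b:
        return 1.0
    for members in CLASSES.values():
        if a in members and b in members:
            return 0.5
    return 0.0


def pair_weight(res_a, res_b, sigma=2.0):
    """Edge weight: chemical score damped by distance after superposition."""
    (aa_a, xyz_a), (aa_b, xyz_b) = res_a, res_b
    d = dist(xyz_a, xyz_b)
    return chem_similarity(aa_a, aa_b) * exp(-((d / sigma) ** 2))


def best_matching(site_a, site_b):
    """Exhaustive maximum-weight bipartite matching for small sites.

    Returns (score, pairs), where pairs holds (index in site_a, index in site_b).
    """
    if len(site_a) > len(site_b):
        # Match the smaller site into the larger one, then flip the indices back.
        score, pairs = best_matching(site_b, site_a)
        return score, [(j, i) for i, j in pairs]
    best_score, best_pairs = -1.0, []
    for perm in permutations(range(len(site_b)), len(site_a)):
        score = sum(pair_weight(site_a[i], site_b[j]) for i, j in enumerate(perm))
        if score > best_score:
            best_score, best_pairs = score, list(enumerate(perm))
    return best_score, best_pairs


# Two made-up sites: (one-letter amino acid, coordinates in angstroms).
site_a = [("D", (0.0, 0.0, 0.0)), ("L", (4.0, 0.0, 0.0)), ("K", (0.0, 4.0, 0.0))]
site_b = [("L", (4.2, 0.0, 0.0)), ("E", (0.3, 0.0, 0.0)),
          ("R", (0.0, 4.1, 0.0)), ("G", (9.0, 9.0, 9.0))]

score, pairs = best_matching(site_a, site_b)
print(pairs, round(score, 3))
```

In this toy example the matching pairs D with E, L with L and K with R, and leaves the distant glycine unmatched. A full method would also search over transformations and keep the one with the highest matching score.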

Keywords

Binding site; Similarity; Geometric; Prediction

Abbreviations

CGS-BSite: Chemical and geometric similarity of binding site
ROC: Receiver operating characteristic
PDB: Protein data bank
BSSF: Binding site similarity and function
APF: Atomic property fields
2D: 2 Dimensions
TIPSA: Triangulation-based Iterative-closest-point for protein surface alignment
ACT: Acetate ion
BLOSUM: BLOcks substitution matrix
CC: Correlation coefficient
ECDF: Empirical cumulative distribution function
AA: Amino acid
AUC: Area under curve
3D: 3 Dimensions
BLAST: Basic local alignment search tool
PLM: Palmitic acid
GPDH: Glycerol-3-phosphate dehydrogenase
AMP: Adenosine monophosphate
ATP: Adenosine-5′-triphosphate
NAD: Nicotinamide-adenine-dinucleotide
AND: Adenosine
GAL: Beta-D-galactose
RTL: Retinol
FAD: Flavin-adenine dinucleotide

Notes

Acknowledgments

The authors thank the anonymous reviewers for their criticism and valuable suggestions. This work was supported by the National 973 Key Basic Research Program (Grant Nos. 2012CB017062 and 2010CB945401), the National Natural Science Foundation of China (Grant Nos. 31240038, 31071162 and 31000590) and the Science and Technology Commission of Shanghai Municipality (Grant No. 11DZ2260300).

Conflict of interest

None.

Supplementary material

10930_2013_9494_MOESM1_ESM.docx (33 kb)
Supplementary material 1 (DOCX 34 kb)

Copyright information

© Springer Science+Business Media New York 2013

Authors and Affiliations

  1. Center for Bioinformatics and Computational Biology, Shanghai Key Laboratory of Regulatory Biology, The Institute of Biomedical Sciences and School of Life Sciences, East China Normal University, Shanghai, China