Survey of Computational Approaches for Prediction of DNA-Binding Residues on Protein Surfaces

  • Yi Xiong
  • Xiaolei Zhu
  • Hao Dai
  • Dong-Qing Wei
Part of the Methods in Molecular Biology book series (MIMB, volume 1754)


The growing number of protein structures with uncharacterized function necessitates the development of in silico methods for the functional annotation of proteins. This chapter briefly introduces the different kinds of computational approaches for predicting DNA-binding residues on the surface of DNA-binding proteins and discusses their merits and limitations. It focuses on structure-based approaches and on the framework of machine learning methods as applied to the DNA-binding residue prediction task.

Key words

Structure-based function prediction; Functional annotation; DNA-binding residue; Machine learning method



This work was supported by grants from the National Natural Science Foundation of China for Young Scholars (Grant Nos. 31601074 and 21403002), funding from the National Key Research Program (Contract No. 2016YFA0501703), and the Open Fund of the Shanghai Key Laboratory of Intelligent Information Processing (Contract No. IIPL-2016-005).



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China
