The Protein Journal, Volume 28, Issue 2, pp 111–115

Prediction of Interacting Protein Pairs from Sequence Using a Bayesian Method

  • Chishe Wang
  • Jiaxing Cheng
  • Shoubao Su


As bioinformatics has developed, an increasing amount of protein sequence information has become available, yet the number of known protein–protein interactions (PPIs) remains very limited. In this article, we propose a method for predicting interacting protein pairs from sequence using a Bayesian approach built on a new feature representation. We trained our model on 6,459 PPI pairs from the core subset for the yeast Saccharomyces cerevisiae. On data from six species in the DIP database, the model achieved an average prediction accuracy of 93.67%. The results indicate that our method outperforms other methods in both computing time and prediction accuracy.
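The abstract does not spell out the feature representation or the exact Bayesian model, so the following is only an illustrative sketch and not the authors' method. It assumes that each protein is encoded by its 20-dimensional amino acid composition, that a pair is the concatenation of the two vectors, and that a simple Gaussian naive Bayes classifier separates interacting from non-interacting pairs. The names `composition`, `pair_features`, and `GaussianNaiveBayes`, as well as the toy sequences, are hypothetical.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def composition(seq):
    """Return the fraction of each standard amino acid in a sequence."""
    counts = Counter(c for c in seq.upper() if c in AMINO_ACIDS)
    total = sum(counts.values()) or 1  # guard against empty sequences
    return [counts[a] / total for a in AMINO_ACIDS]


def pair_features(seq_a, seq_b):
    """Assumed pair encoding: concatenate both compositions (40 values)."""
    return composition(seq_a) + composition(seq_b)


class GaussianNaiveBayes:
    """Minimal Gaussian naive Bayes; a stand-in for the paper's Bayesian model."""

    def fit(self, X, y):
        self.stats = {}
        for label in set(y):
            rows = [x for x, t in zip(X, y) if t == label]
            cols = list(zip(*rows))
            means = [sum(col) / len(rows) for col in cols]
            # A small variance floor avoids division by zero.
            variances = [
                sum((v - m) ** 2 for v in col) / len(rows) + 1e-6
                for col, m in zip(cols, means)
            ]
            log_prior = math.log(len(rows) / len(y))
            self.stats[label] = (log_prior, means, variances)
        return self

    def predict(self, x):
        def log_posterior(label):
            log_prior, means, variances = self.stats[label]
            return log_prior + sum(
                -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
                for v, m, var in zip(x, means, variances)
            )
        return max(self.stats, key=log_posterior)


# Toy, made-up sequences purely for illustration (1 = interacting, 0 = not).
train_pairs = [
    ("AAAAAA", "AAAAAA", 1),
    ("AAAAAG", "AAAGAA", 1),
    ("WWWWWW", "WWWWWW", 0),
    ("WWWWWY", "WWYWWW", 0),
]
X = [pair_features(a, b) for a, b, _ in train_pairs]
y = [label for _, _, label in train_pairs]
model = GaussianNaiveBayes().fit(X, y)
print(model.predict(pair_features("AAAAAA", "AAAAGA")))
```

In practice the training set would be real interacting pairs (for example from the DIP core subset) together with a chosen set of non-interacting pairs; how negatives are selected is a design decision this sketch does not address.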


Keywords

Protein–protein interactions · Feature vector · Bayesian method · Amino acid composition



Abbreviations

PPIs: Protein–protein interactions
DIP: Database of interacting proteins
PDB: Protein data bank
SVM: Support vector machine
EPR: Expression profile reliability
PVM: Paralogous verification method
TP: True positive
TN: True negative
FP: False positive
FN: False negative
ROC: Receiver operating characteristics
AUC: The area under the curve



This work was partially supported by the Provincial Natural Science Fund project of the Bureau of Education of AnHui Province (Nos. KJ2007B066, KJ2007A087).



Copyright information

© Springer Science+Business Media, LLC 2009

Authors and Affiliations

  1. Key Laboratory of Intelligent Computing & Signal Processing, Ministry of Education, AnHui University, Hefei, China
  2. Department of Computer Science and Technology, Chaohu College, Chaohu, China
