
Predicting Protein-Protein Interactions with K-Nearest Neighbors Classification Algorithm

  • Conference paper
Computational Intelligence Methods for Bioinformatics and Biostatistics (CIBB 2009)

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 6160)

Abstract

In this work we address the problem of predicting protein-protein interactions. Solving it can give greater insight into complex diseases such as cancer, and provides valuable information for the study of active small molecules for new drugs by limiting the number of molecules that must be tested in the laboratory. We model the problem as a binary classification task, using a suitable coding of the amino acid sequences. We apply the k-Nearest Neighbors classification algorithm to the classes of interacting and noninteracting proteins. Results show that high prediction accuracy can be achieved in cross validation. A case study shows that a real network of thousands of interacting proteins can be reconstructed with high accuracy on standard hardware.
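The pipeline outlined in the abstract (encode each protein pair as a numeric vector, classify it by k-Nearest Neighbors, estimate accuracy by cross validation) can be illustrated with a minimal sketch. The abstract does not specify the sequence coding, so the amino acid composition encoding below is an assumption chosen for brevity, not the authors' scheme. The function names, the toy sequences, and their labels are likewise hypothetical and made up for illustration.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def composition(seq):
    """Fraction of each standard amino acid in seq (assumed encoding)."""
    if not seq:
        raise ValueError("empty sequence")
    counts = Counter(seq)
    return [counts[a] / len(seq) for a in AMINO_ACIDS]


def encode_pair(seq_a, seq_b):
    """Concatenate both composition vectors into one 40-dimensional vector."""
    return composition(seq_a) + composition(seq_b)


def knn_predict(train, query, k=3):
    """Majority vote among the k training vectors closest to query.

    train is a list of (vector, label) with label 1 = interacting,
    0 = noninteracting; distance is Euclidean.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if 2 * votes > k else 0


def leave_one_out_accuracy(data, k=3):
    """Cross validation in its simplest form: hold out one pair at a time."""
    hits = 0
    for i, (vector, label) in enumerate(data):
        rest = data[:i] + data[i + 1:]
        hits += knn_predict(rest, vector, k) == label
    return hits / len(data)


# Toy, made-up sequences: not real proteins and not data from the paper.
toy_pairs = [
    ("AAAAAKKK", "DDDEEEEE", 1),
    ("AAAKKKKK", "DDDDDEEE", 1),
    ("AAAAKKKK", "DDDDEEEE", 1),
    ("GGGGGPPP", "WWWYYYYY", 0),
    ("GGGPPPPP", "WWWWWYYY", 0),
    ("GGGGPPPP", "WWWWYYYY", 0),
]
train = [(encode_pair(a, b), label) for a, b, label in toy_pairs]

prediction = knn_predict(train, encode_pair("AKAKAKAK", "DEDEDEDE"), k=3)
accuracy = leave_one_out_accuracy(train, k=3)
```

Composition vectors discard residue order, so a real study would likely need a richer coding. The sketch only shows how the classification and validation steps fit together.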




Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Guarracino, M.R., Nebbia, A. (2010). Predicting Protein-Protein Interactions with K-Nearest Neighbors Classification Algorithm. In: Masulli, F., Peterson, L.E., Tagliaferri, R. (eds) Computational Intelligence Methods for Bioinformatics and Biostatistics. CIBB 2009. Lecture Notes in Computer Science, vol 6160. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-14571-1_10


  • DOI: https://doi.org/10.1007/978-3-642-14571-1_10

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-14570-4

  • Online ISBN: 978-3-642-14571-1

  • eBook Packages: Computer Science, Computer Science (R0)
