Protein Function Prediction Based on Patterns in Biological Networks

  • Mustafa Kirac
  • Gultekin Ozsoyoglu
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4955)


In this paper, we propose a pattern-based protein function annotation framework that uses protein interaction networks to predict protein functions. More specifically, we first detect patterns that appear in the neighborhood of proteins with a particular functionality, and then transfer annotations between two proteins only if they have similar annotation patterns. We show that our approach predicts protein annotations more effectively than other techniques. Our technique (a) achieves the highest prediction accuracy, with 70-80% precision and recall across different organism-specific datasets, and (b) is robust to false positives in protein interaction networks.


Biological Network; Protein Interaction Network; Alignment Score; Protein Function Prediction; Protein Interaction Data
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Mustafa Kirac (1)
  • Gultekin Ozsoyoglu (1)
  1. Case Western Reserve University, Cleveland, USA
