B3Clustering: Identifying Protein Complexes from Protein-Protein Interaction Network

  • Eunjung Chin
  • Jia Zhu
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7808)


Cluster analysis is one of the most important challenges for data mining in modern biology. Advances in experimental technologies have produced large amounts of binary protein-protein interaction data, but it is hard to find protein complexes in vitro. We introduce a new algorithm called B3Clustering, which detects densely connected subgraphs in a complicated and noisy graph.
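To make "densely connected subgraph" concrete, the following is a minimal sketch of the usual edge-density measure for an induced subgraph: the fraction of possible vertex pairs that are joined by an edge. The function name `subgraph_density` and the toy protein labels are hypothetical, and the abstract does not state which density definition B3Clustering actually uses.

```python
from itertools import combinations


def subgraph_density(edges, vertices):
    """Fraction of vertex pairs within `vertices` that are joined by an edge.

    Assumption: the graph is undirected and unweighted, as in a binary
    PPI network. Returns 0.0 for fewer than two vertices.
    """
    vs = set(vertices)
    n = len(vs)
    if n < 2:
        return 0.0
    # Store each edge as an unordered pair so (u, v) and (v, u) match.
    edge_set = {frozenset(e) for e in edges}
    present = sum(
        1 for u, v in combinations(vs, 2) if frozenset((u, v)) in edge_set
    )
    return present / (n * (n - 1) / 2)


# Toy PPI network with hypothetical protein names.
ppi_edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]
triangle_density = subgraph_density(ppi_edges, ["A", "B", "C"])
whole_density = subgraph_density(ppi_edges, ["A", "B", "C", "D"])
```

In this toy graph the triangle A, B, C is a clique, while adding the loosely attached vertex D lowers the density of the whole set.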

B3Clustering finds clusters by letting the required density of a subgraph vary with its size, because the more vertices a cluster has, the less dense it tends to be. B3Clustering bisects the paths of distance 3 into two groups and selects vertices from each group. We evaluate B3Clustering and two other clustering methods on three different PPI networks, and then compare the resulting clusters from each method with the benchmark complex set CYC2008. The experimental results support the efficiency and robustness of B3Clustering for protein complex prediction in PPI networks.
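Comparing predicted clusters with a benchmark set can be sketched as follows. This example assumes a commonly used overlap score, the squared size of the intersection divided by the product of the two set sizes, and a cutoff of 0.2; neither the score nor the cutoff is given in the abstract, so the paper's evaluation may differ. The names `overlap_score` and `matched_clusters` and the protein labels are hypothetical.

```python
def overlap_score(cluster, reference):
    """Overlap between two protein sets: |A ∩ B|^2 / (|A| * |B|).

    Assumption: this score is a common choice in the literature,
    not necessarily the one used in the paper.
    """
    a, b = set(cluster), set(reference)
    if not a or not b:
        return 0.0
    shared = len(a & b)
    return shared * shared / (len(a) * len(b))


def matched_clusters(clusters, benchmark, threshold=0.2):
    """Return the predicted clusters that match at least one benchmark complex."""
    return [
        c
        for c in clusters
        if any(overlap_score(c, ref) >= threshold for ref in benchmark)
    ]


# Toy data with hypothetical protein identifiers (not taken from CYC2008).
predicted = [{"P1", "P2", "P3"}, {"P7", "P8"}]
benchmark = [{"P1", "P2", "P3", "P4"}, {"P5", "P6"}]

hits = matched_clusters(predicted, benchmark)
precision = len(hits) / len(predicted)
```

Here the first predicted cluster shares three of its three proteins with a four-protein benchmark complex, so it counts as a match, while the second shares nothing with either complex.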


Protein Complex, Maximal Clique, Protein Interaction Network, Predict Protein Complex, Detect Protein Complex
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Eunjung Chin (1)
  • Jia Zhu (1)
  1. School of ITEE, The University of Queensland, Australia
