Toward Large-Scale Computational Prediction of Protein Complexes

  • Simone Rizzetto
  • Attila Csikász-Nagy
Part of the Methods in Molecular Biology book series (MIMB, volume 1819)


Cellular functions are often performed by multiprotein structures called protein complexes. These complexes are dynamic structures that change during the cell cycle or in response to external and internal stimuli. They are also tightly regulated by protein expression, which differs between tissues and results in quantitative and qualitative variation of protein complexes. Advances in high-throughput techniques, such as mass spectrometry and yeast two-hybrid screening, have provided a large amount of data on protein–protein interactions. This has sparked the development of computational methods that aim to predict protein complex formation under a variety of biological and clinical conditions. However, successful computational prediction of protein complexes still faces substantial challenges.

The post-genomic era has seen the emergence of numerous algorithms and software tools that predict protein complexes from protein–protein interaction networks and a variety of other data sources. Although these methods predict protein complexes well qualitatively, they provide limited or no quantitative information about the predicted complexes. Recently, a large-scale simulation approach achieved this by simulating protein complex formation at the proteome scale.

In this chapter, we review representative methods that can predict multiple protein complexes at different scales and discuss how these can be combined with emerging sources of data in order to improve protein complex characterization.

Key words

Protein complexes; Protein interactions; Proteome-wide simulations; Complexome; Interactome; Disease-associated protein complexes



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  • Simone Rizzetto (1, 2)
  • Attila Csikász-Nagy (3, 4)
  1. School of Medical Sciences, Kensington, Australia
  2. Viral Immunology Systems Program, Kirby Institute for Infection and Immunity, Kensington, Australia
  3. Randall Division of Cell and Molecular Biophysics, Institute for Mathematical and Molecular Biomedicine, King’s College London, London, UK
  4. Faculty of Information Technology and Bionics, Pázmány Péter Catholic University, Budapest, Hungary
