Bootstrapping the Interactome: Unsupervised Identification of Protein Complexes in Yeast

  • Caroline C. Friedel
  • Jan Krumsiek
  • Ralf Zimmer
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4955)


Protein interactions and complexes are major components of biological systems. Recent genome-wide applications of tandem affinity purification (TAP) in yeast have significantly increased the available information on such interactions. From these experiments, protein complexes were predicted with different approaches, first from the individual experiments only and later from their combination. The resulting predictions showed surprisingly little agreement, and all of the corresponding methods rely on additional training data. In this article, we present an unsupervised algorithm for the identification of protein complexes that does not depend on the availability of additional complex information. Based on a bootstrap approach, we calculated intuitive confidence scores for interactions, which are more accurate than previous scoring metrics. The complexes determined from this confidence network are of similar quality to the complexes identified by the best supervised approaches. Although the latest predictions and our predictions are of similar quality, considerable differences are still observed between all of them. Nevertheless, the set of consistently identified complexes is more than four times as large as for the first two studies. Our results illustrate that meaningful and reliable complexes can be determined from the purification experiments alone. As a consequence, the approach presented in this article is easily applicable to large-scale TAP experiments for any organism.
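
The general idea of a bootstrap-based confidence score can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the abstract does not specify how interactions are scored within each bootstrap sample, so the sketch assumes a deliberately simple stand-in, in which the confidence of a protein pair is the fraction of bootstrap replicates where the two proteins are co-purified at least once. The function name `bootstrap_confidence`, its parameters, and the toy protein labels are hypothetical.

```python
import random
from collections import Counter
from itertools import combinations


def bootstrap_confidence(purifications, n_replicates=1000, seed=0):
    """Return {(protein_a, protein_b): confidence} for co-purified pairs.

    Assumption (not from the paper): confidence is the fraction of
    bootstrap replicates in which the pair is co-purified at least once.
    """
    rng = random.Random(seed)
    counts = Counter()
    n = len(purifications)
    for _ in range(n_replicates):
        # Draw n purifications with replacement from the original set.
        sample = [purifications[rng.randrange(n)] for _ in range(n)]
        # Record each co-purified pair at most once per replicate.
        pairs = set()
        for proteins in sample:
            pairs.update(combinations(sorted(set(proteins)), 2))
        counts.update(pairs)
    return {pair: c / n_replicates for pair, c in counts.items()}


# Toy data: each purification is the set of proteins recovered together.
toy = [
    {"A", "B", "C"},
    {"A", "B"},
    {"A", "B", "C"},
    {"C", "D"},
]
scores = bootstrap_confidence(toy, n_replicates=500, seed=1)
```

In this toy example, the pair (A, B) is supported by three of the four purifications and therefore survives resampling more often than (C, D), which is supported by only one. Pairs that are never co-purified, such as (A, D), receive no score at all.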


Keywords: Gene Ontology; Positive Predictive Value; Bootstrap Sample; Tandem Affinity Purification; Saccharomyces Genome Database
These keywords were added by machine and not by the authors.





Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Caroline C. Friedel (1)
  • Jan Krumsiek (1)
  • Ralf Zimmer (1)
  1. Institut für Informatik, Ludwig-Maximilians-Universität München, München, Germany
