Detection of Locally Over-Represented GO Terms in Protein-Protein Interaction Networks

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 5541)

Abstract

High-throughput methods for identifying protein-protein interactions produce increasingly complex and intricate interaction networks. These networks are extremely rich in information, but extracting biologically meaningful hypotheses from them and representing them in a human-readable manner is challenging. We propose a method to identify Gene Ontology (GO) terms that are locally over-represented in a subnetwork of a given biological network. Specifically, we propose two methods to evaluate the degree of clustering of proteins associated with a particular GO term and describe four efficient methods to estimate the statistical significance of the observed clustering. We show, using Monte Carlo simulations, that our best approximation methods accurately estimate the true p-value, for random scale-free graphs as well as for actual yeast and human networks. When applied to these two biological networks, our approach recovers many known complexes and pathways, but also suggests potential functions for many subnetworks.
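
The abstract does not spell out the two clustering measures or the four significance approximations, so the following Python sketch only illustrates the general idea under stated assumptions. Clustering is scored here as the mean pairwise shortest-path distance between the proteins annotated with a term (an assumed measure, not necessarily one of the paper's), and significance is estimated with a plain Monte Carlo baseline that draws random protein sets of the same size. All function names and the toy network are hypothetical.

```python
import random
from collections import deque
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map from a list of (u, v) pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def bfs_distances(adj, source):
    """Hop distances from source to every reachable node (unweighted BFS)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def mean_pairwise_distance(all_dist, nodes):
    """Assumed clustering score: lower means the set is more tightly clustered."""
    pairs = list(combinations(nodes, 2))
    return sum(all_dist[a][b] for a, b in pairs) / len(pairs)


def monte_carlo_p_value(adj, annotated, n_samples=2000, seed=0):
    """Empirical p-value: fraction of random same-size sets at least as clustered.

    Assumes the network is connected, so every pairwise distance is defined.
    """
    rng = random.Random(seed)
    nodes = sorted(adj)
    all_dist = {u: bfs_distances(adj, u) for u in nodes}
    observed = mean_pairwise_distance(all_dist, annotated)
    hits = 0
    for _ in range(n_samples):
        sample = rng.sample(nodes, len(annotated))
        if mean_pairwise_distance(all_dist, sample) <= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return observed, (hits + 1) / (n_samples + 1)


# Hypothetical toy network: a 4-protein clique attached to a 6-protein chain.
edges = [
    ("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"), ("b", "d"), ("c", "d"),
    ("d", "e"), ("e", "f"), ("f", "g"), ("g", "h"), ("h", "i"), ("i", "j"),
]
adj = build_adjacency(edges)

# Pretend that a, b and c share one GO term.
observed, p_value = monte_carlo_p_value(adj, ["a", "b", "c"])
print(f"mean pairwise distance = {observed:.2f}, empirical p-value = {p_value:.4f}")
```

In this toy network the three annotated proteins are mutually adjacent, so the observed score is 1.0, and few random triples are as compact. A sampling baseline of this kind is the sort of reference against which faster approximations can be checked, at the cost of many distance evaluations per term.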

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Lavallée-Adam, M., Coulombe, B., Blanchette, M. (2009). Detection of Locally Over-Represented GO Terms in Protein-Protein Interaction Networks. In: Batzoglou, S. (eds) Research in Computational Molecular Biology. RECOMB 2009. Lecture Notes in Computer Science, vol 5541. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-02008-7_23

  • DOI: https://doi.org/10.1007/978-3-642-02008-7_23

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-02007-0

  • Online ISBN: 978-3-642-02008-7

  • eBook Packages: Computer Science, Computer Science (R0)
