Integrating In Silico Resources to Map a Signaling Network

  • Protocol
Gene Function Analysis

Part of the book series: Methods in Molecular Biology ((MIMB,volume 1101))

Abstract

The abundance of publicly available life science databases offers a wealth of information that can support interpretation of experimentally derived data and greatly enhance hypothesis generation. Protein interaction and functional networks are not simply new renditions of existing data: they provide the opportunity to gain insights into the specific physical and functional role a protein plays as part of the biological system. In this chapter, we describe different in silico tools that can quickly and conveniently retrieve data from existing data repositories, and we discuss how the available tools are best utilized for different purposes. While emphasizing protein–protein interaction databases (e.g., BioGRID and IntAct), we also introduce metasearch platforms such as STRING and GeneMANIA, pathway databases (e.g., BioCarta and Pathway Commons), text mining approaches (e.g., PubMed and Chilibot), and resources for drug–protein interactions, genetic information for model organisms, and gene expression information based on microarray data mining. Furthermore, we provide a simple step-by-step protocol for building customized protein–protein interaction networks in Cytoscape, a powerful network assembly and visualization program, integrating data retrieved from these various databases. As we illustrate, generation of composite interaction networks enables investigators to extract significantly more information about a given biological system than utilization of a single database or sole reliance on primary literature.
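The idea of a composite interaction network can be illustrated with a small sketch. This is not the chapter's protocol, only a minimal Python example under stated assumptions: each database export has already been reduced to pairs of protein identifiers in a shared namespace, and the function names (`merge_interactions`, `support_counts`, `to_sif`), the placeholder identifiers, and the `pp` interaction label are all hypothetical. The output uses the simple interaction format (SIF) that Cytoscape can import, with one `source<TAB>type<TAB>target` row per edge.

```python
from collections import defaultdict


def merge_interactions(sources):
    """Combine per-database edge lists into one composite network.

    `sources` maps a database label to an iterable of (protein_a, protein_b)
    pairs. Edges are treated as undirected, so (A, B) and (B, A) collapse
    into one entry. Returns {edge: set of database labels reporting it}.
    """
    composite = defaultdict(set)
    for database, pairs in sources.items():
        for a, b in pairs:
            # Normalize case and whitespace so the same identifier matches
            # across sources (assumes a shared identifier namespace).
            a, b = a.strip().upper(), b.strip().upper()
            edge = tuple(sorted((a, b)))
            composite[edge].add(database)
    return dict(composite)


def support_counts(composite):
    """Number of independent sources reporting each edge."""
    return {edge: len(databases) for edge, databases in composite.items()}


def to_sif(composite, interaction="pp"):
    """Render the network as SIF text: source<TAB>type<TAB>target per line.

    The default label "pp" is an assumed tag for protein-protein edges.
    """
    rows = [f"{a}\t{interaction}\t{b}" for (a, b) in sorted(composite)]
    return "\n".join(rows) + "\n"


if __name__ == "__main__":
    # Placeholder pairs for illustration only; these are not real records
    # from any of the named databases.
    toy_sources = {
        "BioGRID": [("GENE_A", "GENE_B"), ("GENE_B", "GENE_C")],
        "IntAct": [("gene_b", "gene_a")],
        "STRING": [("GENE_A", "GENE_C")],
    }
    network = merge_interactions(toy_sources)
    for edge, count in sorted(support_counts(network).items()):
        print(edge, count)
    print(to_sif(network), end="")
```

Keeping the set of supporting databases for each edge, rather than only the merged edge list, makes it possible to filter or style edges by how many independent sources report them once the network is loaded into a visualization tool.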

Acknowledgments

The authors were supported by U54 CA149147, R01 CA63366, and P50 CA083638 from the NIH (to EAG), a postdoctoral fellowship from the SASS Foundation for Medical Research and an Ann Schreiber Program of Excellence Grant from the Ovarian Cancer Research Fund (to HL), the Drexel University College of Medicine MD-PhD Program (to TNB), and NIH core grant CA06927 (to Fox Chase Cancer Center).

Copyright information

© 2014 Springer Science+Business Media, LLC

About this protocol

Cite this protocol

Liu, H., Beck, T.N., Golemis, E.A., Serebriiskii, I.G. (2014). Integrating In Silico Resources to Map a Signaling Network. In: Ochs, M. (eds) Gene Function Analysis. Methods in Molecular Biology, vol 1101. Humana Press, Totowa, NJ. https://doi.org/10.1007/978-1-62703-721-1_11

  • DOI: https://doi.org/10.1007/978-1-62703-721-1_11

  • Publisher Name: Humana Press, Totowa, NJ

  • Print ISBN: 978-1-62703-720-4

  • Online ISBN: 978-1-62703-721-1

  • eBook Packages: Springer Protocols
