A Guide to Protein Interaction Databases

  • Tiffany B. Fischer
  • Melissa Paczkowski
  • Michael F. Zettel
  • Jerry Tsai
Part of the Springer Protocols Handbooks book series (SPH)


As new genomes continue to be completed, discovering the purpose of all these nucleotide sequences takes on greater importance. Proteins, the predominant products of gene sequences, have been a natural place to start, and projects have begun characterizing all the proteins from a genome (the proteome). One aspect of the proteome that has been amenable to large-scale bioinformatics studies is the identification of interacting proteins and the mapping of protein interaction networks (a complete set for a genome is commonly referred to as the interactome). Because studies of protein interaction produce large amounts of data, the challenge has become how to present such data sets in a meaningful and informative manner, so that they are a resource for the general biological community. The typical scientific medium for sharing information is publication in a journal article. As a physical medium of paper and text, publications are inadequate for presenting such large sets of information and are limited to an overview and some general conclusions about the data. Instead, presentation of these large data sets has taken advantage of the relational capabilities provided by computers and the broad accessibility provided by the Internet. These protein interaction data sets are stored in databases, which enables simple implementations of search and browse functions, and are presented on the Internet through a World Wide Web front end. As the amount of protein interaction data has increased, so too have the number and variety of databases. This chapter is intended for the general research community as a guide to increase the visibility and accessibility of these information resources on protein interactions.


Keywords: Protein Data Bank; Interaction Database; Result Page; Human Protein Reference Database; Major Histocompatibility Complex



Copyright information

© Humana Press Inc., Totowa, NJ 2005

Authors and Affiliations

  • Tiffany B. Fischer (1)
  • Melissa Paczkowski (1)
  • Michael F. Zettel (1)
  • Jerry Tsai (1)

  1. Department of Biochemistry and Biophysics, Texas A&M University
