Scoring Protein-Protein Interactions Using the Width of Gene Ontology Terms and the Information Content of Common Ancestors

  • Guangyu Cui
  • Kyungsook Han
Part of the Communications in Computer and Information Science book series (CCIS, volume 375)


Several methods have been proposed to measure the semantic similarity of proteins. In particular, the Gene Ontology (GO) is often used to estimate the semantic similarity of proteins annotated with GO terms, because it provides the largest and most reliable vocabulary of gene products and their characteristics. We developed a new measure of the semantic similarity of proteins involved in protein-protein interactions, based on the width of GO terms and the information content of their common ancestors in the GO hierarchy. In a comparative evaluation against other GO-based similarity measures, our method outperformed the others in most GO domains.
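The abstract does not give the formula for the width-based measure, so the following is only a minimal sketch of the underlying idea of scoring two terms by the information content of their common ancestors (a Resnik-style score), not the authors' method. The toy hierarchy, the term names, and the annotation counts are hypothetical and do not correspond to real GO identifiers or data.

```python
import math

# Hypothetical toy is-a hierarchy (child -> parents); not real GO terms.
PARENTS = {
    "root": [],
    "A": ["root"],
    "B": ["root"],
    "A1": ["A"],
    "A2": ["A"],
    "AB": ["A", "B"],  # a term with two parents, as allowed in a DAG
}

# Hypothetical number of annotations made directly to each term.
DIRECT_COUNTS = {"root": 0, "A": 1, "B": 2, "A1": 3, "A2": 2, "AB": 2}


def ancestors(term):
    """Return the term itself together with all of its ancestors."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def information_content():
    """IC(t) = -log(freq(t) / freq(root)), where freq(t) counts the
    annotations to t and to every descendant of t."""
    freq = {t: 0 for t in PARENTS}
    for term, n in DIRECT_COUNTS.items():
        for a in ancestors(term):
            freq[a] += n
    total = freq["root"]
    return {t: -math.log(f / total) for t, f in freq.items() if f > 0}


def resnik_similarity(t1, t2, ic):
    """Information content of the most informative common ancestor."""
    common = ancestors(t1) & ancestors(t2)
    return max(ic[t] for t in common)


ic = information_content()
print(resnik_similarity("A1", "A2", ic))  # common ancestors: A and root
print(resnik_similarity("A2", "B", ic))   # only the root is shared
```

In this sketch, terms that share only the root score zero, while terms that share a more specific (less frequently annotated) ancestor score higher. A protein-level score would additionally need a rule for combining term-level scores across each protein's annotations, which the abstract does not specify.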


Keywords: semantic similarity, protein-protein interactions, width of GO terms





Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Guangyu Cui (1)
  • Kyungsook Han (1)
  1. School of Computer Science and Engineering, Inha University, Incheon, Korea
