Improving Classification Accuracy Using Gene Ontology Information

  • Ying Shen
  • Lin Zhang
Part of the Communications in Computer and Information Science book series (CCIS, volume 375)


Classification problems, e.g., the gene function prediction problem, are very important in bioinformatics. Previous work has mainly focused on improving the classification techniques used. With the emergence of the Gene Ontology (GO), extra knowledge about gene products can be extracted from GO. This kind of knowledge reveals the relationships among gene products and is helpful for solving classification problems. In this paper, we propose a new method to integrate knowledge from GO into classifiers. Experimental results demonstrate the efficacy of the new method.


Keywords: Gene Ontology; Semantic Similarity; Distance Metric Learning
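
The abstract does not describe how GO knowledge enters the classifier, so the following is only a minimal sketch of one possible way to do it, not the authors' method. It assumes that a GO-based semantic similarity in [0, 1] is already available for every pair of gene products, and it blends the corresponding dissimilarity into the distance used by a 1-nearest-neighbour classifier. The function names, the `alpha` weight, and all data values are hypothetical. This is a fixed blend rather than a learned metric.

```python
import math


def blended_distance(x, y, go_sim, alpha=0.5):
    """Blend Euclidean feature distance with a GO-derived dissimilarity.

    go_sim is assumed to be a semantic similarity in [0, 1];
    alpha in [0, 1] is a hypothetical weight for the GO term.
    """
    feature_dist = math.dist(x, y)
    return (1.0 - alpha) * feature_dist + alpha * (1.0 - go_sim)


def predict_1nn(query, train, features, labels, sim, alpha=0.5):
    """Return the label of the training item closest to the query item."""
    nearest = min(
        train,
        key=lambda j: blended_distance(
            features[query], features[j], sim[query][j], alpha
        ),
    )
    return labels[nearest]


# Toy data, entirely made up for illustration.
features = [(0.0, 0.0), (1.0, 0.0), (0.4, 0.0)]
labels = ["A", "B", None]  # item 2 is the unlabeled query
sim = [
    [1.0, 0.1, 0.2],
    [0.1, 1.0, 0.9],
    [0.2, 0.9, 1.0],
]

features_only = predict_1nn(2, [0, 1], features, labels, sim, alpha=0.0)
with_go = predict_1nn(2, [0, 1], features, labels, sim, alpha=0.5)
```

In this toy setup the query lies closer to item 0 in feature space, so `features_only` is `"A"`. Once the GO dissimilarity is given equal weight, the high semantic similarity to item 1 changes the prediction, and `with_go` is `"B"`.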





Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Ying Shen (1)
  • Lin Zhang (1)
  1. School of Software Engineering, Tongji University, Shanghai, China
