An Experimental Comparison of Hierarchical Bayes and True Path Rule Ensembles for Protein Function Prediction

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 5997)

Abstract

The computational genome-wide annotation of gene functions requires the prediction of hierarchically structured functional classes and can be formalized as a multiclass, multilabel, multipath hierarchical classification problem characterized by very unbalanced classes. We recently proposed two hierarchical protein function prediction methods, the Hierarchical Bayes (hbayes) and True Path Rule (tpr) ensemble methods. Both reconcile the predictions of component classifiers trained locally at each term of the ontology, and both can control the overall precision-recall trade-off. In this contribution, we focus on the experimental comparison of the hbayes and tpr hierarchical gene function prediction methods and their cost-sensitive variants, using the model organism S. cerevisiae and the FunCat taxonomy. The results show that the cost-sensitive variants of these methods perform comparably to each other and significantly outperform both their flat and their non-cost-sensitive hierarchical counterparts.
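To make the idea of reconciling locally trained classifiers concrete, the following is a minimal sketch of one simple way to make per-class scores consistent with a tree-structured taxonomy. It is not the hbayes or tpr algorithm from the paper: it only illustrates the general true path rule constraint that a class should not be predicted unless its ancestors are too. The function names, the FunCat-style class labels, the scores, and the thresholds are all hypothetical and chosen for illustration.

```python
from typing import Dict, Optional, Set


def enforce_hierarchy(
    scores: Dict[str, float], parent: Dict[str, Optional[str]]
) -> Dict[str, float]:
    """Cap each class score at the reconciled score of its parent.

    Illustrative top-down reconciliation only (an assumption of this
    sketch, not the paper's hbayes or tpr procedure).
    """
    reconciled: Dict[str, float] = {}

    def resolve(node: str) -> float:
        if node in reconciled:
            return reconciled[node]
        score = scores[node]
        up = parent.get(node)
        if up is not None:
            # A child can never be more confident than its parent.
            score = min(score, resolve(up))
        reconciled[node] = score
        return score

    for node in scores:
        resolve(node)
    return reconciled


def predict_labels(reconciled: Dict[str, float], threshold: float) -> Set[str]:
    """Return the classes whose reconciled score reaches the threshold."""
    return {node for node, score in reconciled.items() if score >= threshold}


# Hypothetical flat scores from local classifiers, one per taxonomy term.
flat_scores = {"01": 0.4, "01.01": 0.9, "01.01.03": 0.2, "02": 0.7}
# Hypothetical tree: each term points to its parent (None for top-level terms).
parents = {"01": None, "01.01": "01", "01.01.03": "01.01", "02": None}

consistent = enforce_hierarchy(flat_scores, parents)
strict = predict_labels(consistent, threshold=0.5)
lenient = predict_labels(consistent, threshold=0.3)
```

In this toy example the flat score for "01.01" (0.9) is capped at its parent's score (0.4), so a threshold of 0.5 no longer predicts "01.01" without "01". Lowering the threshold to 0.3 admits both terms together, which shows in a small way how a single threshold shifts the balance between precision and recall while the predictions stay consistent with the hierarchy.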

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Re, M., Valentini, G. (2010). An Experimental Comparison of Hierarchical Bayes and True Path Rule Ensembles for Protein Function Prediction. In: El Gayar, N., Kittler, J., Roli, F. (eds) Multiple Classifier Systems. MCS 2010. Lecture Notes in Computer Science, vol 5997. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-12127-2_30

  • DOI: https://doi.org/10.1007/978-3-642-12127-2_30

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-12126-5

  • Online ISBN: 978-3-642-12127-2

  • eBook Packages: Computer Science, Computer Science (R0)