Quantitative Biology, Volume 4, Issue 4, pp 320–330

Performance measures in evaluating machine learning based bioinformatics predictors for classifications

  • Yasen Jiao
  • Pufeng Du



Many existing bioinformatics predictors are based on machine learning. Before these predictors are applied in practical studies, their predictive performance should be well understood. However, different studies apply different performance measures and different evaluation methods. Even for the same performance measure, different terms, nomenclatures, or notations may appear in different contexts.


We reviewed the most commonly used performance measures and evaluation methods for bioinformatics predictors.


Correctly understanding and interpreting predictive performance is important in bioinformatics, because it is the key to rigorously comparing different predictors and choosing the right one.
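To make the point about naming concrete, the following Python sketch computes several widely used binary classification measures from the four cells of a confusion matrix. It is an illustration under stated assumptions, not code from the review: the function name `binary_measures`, the helper `_ratio`, and the example counts are hypothetical. The comments list common synonyms for the same quantity.

```python
import math


def _ratio(numerator, denominator):
    """Hypothetical helper: return NaN when a measure is undefined."""
    return numerator / denominator if denominator else float("nan")


def binary_measures(tp, fp, tn, fn):
    """Compute common binary classification measures.

    tp, fp, tn, fn are the counts of true positives, false positives,
    true negatives, and false negatives.
    """
    sensitivity = _ratio(tp, tp + fn)  # also: recall, true positive rate
    specificity = _ratio(tn, tn + fp)  # also: true negative rate
    precision = _ratio(tp, tp + fp)    # also: positive predictive value
    accuracy = _ratio(tp + tn, tp + fp + tn + fn)
    # F1 score: harmonic mean of precision and sensitivity (recall).
    f1 = _ratio(2 * tp, 2 * tp + fp + fn)
    # Matthews correlation coefficient (MCC).
    mcc = _ratio(
        tp * tn - fp * fn,
        math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
    )
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "accuracy": accuracy,
        "f1": f1,
        "mcc": mcc,
    }


# Made-up counts, used only to show the call.
measures = binary_measures(tp=40, fp=10, tn=45, fn=5)
for name, value in measures.items():
    print(f"{name}: {value:.3f}")
```

Two papers that report "recall" and "sensitivity" for the same predictor are therefore reporting the same number, which is one reason a shared vocabulary matters when predictors are compared.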


Keywords: machine learning; performance measures; evaluation methods



Copyright information

© Higher Education Press and Springer-Verlag GmbH 2016

Authors and Affiliations

  1. School of Computer Science and Technology, Tianjin University, Tianjin, China