Ensemble of classifier chains and Credal C4.5 for solving multi-label classification

  • S. Moral-García
  • Carlos J. Mantas (corresponding author)
  • Javier G. Castellano
  • Joaquín Abellán
Regular Paper


In this work, we consider the ensemble of classifier chains (ECC) algorithm for the multi-label classification (MLC) task. ECC builds on the binary relevance (BR) algorithm, a simple and direct approach to MLC that has been shown to provide good results in practice. Unlike BR, however, ECC aims to exploit the correlations between labels. ECC uses a traditional supervised classification algorithm to solve the binary problems. Within this field, Credal C4.5 (CC4.5) is a recent version of the well-known C4.5 algorithm that uses imprecise probabilities to estimate the probability distribution of the class variable. CC4.5 has been shown to provide better performance when noisy datasets are classified. In MLC, the intrinsic noise might be higher than in traditional supervised classification, for a simple reason: each instance carries multiple labels rather than a single class variable, so an error in at least one of them is more likely. For these reasons, this work studies the performance of ECC with CC4.5 as base classifier. We have carried out an extensive experimental analysis with several multi-label datasets, different noise levels and a large number of evaluation metrics for MLC. This experimental study shows that, in general, ECC performs better with CC4.5 as base classifier than with C4.5. The higher the label noise level introduced in the data, the more significant this improvement is. Therefore, imprecise probabilities appear to be suitable for decision trees within MLC.
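As a rough illustration of what estimating the class distribution with imprecise probabilities can look like, the sketch below computes probability intervals under Walley's imprecise Dirichlet model (IDM). The abstract does not state which model CC4.5 uses, so the IDM itself is an assumption here. The function name `idm_intervals`, the dictionary input format and the default `s = 1.0` are hypothetical choices made for this example, not details taken from the paper.

```python
from typing import Dict, Hashable, Tuple


def idm_intervals(
    counts: Dict[Hashable, int], s: float = 1.0
) -> Dict[Hashable, Tuple[float, float]]:
    """Return (lower, upper) probability bounds for each class value.

    Under the imprecise Dirichlet model, a class observed n times out of
    N instances gets the interval [n / (N + s), (n + s) / (N + s)].

    counts: observed frequency of each class value (hypothetical format).
    s: IDM hyperparameter; 1.0 is an assumed default for illustration.
    """
    if s <= 0:
        raise ValueError("s must be positive")
    if any(n < 0 for n in counts.values()):
        raise ValueError("counts must be non-negative")
    total = sum(counts.values())
    denom = total + s
    return {c: (n / denom, (n + s) / denom) for c, n in counts.items()}


if __name__ == "__main__":
    # One binary problem of a chain: is the label relevant for the instance?
    for value, (low, high) in idm_intervals({"relevant": 3, "irrelevant": 1}).items():
        print(f"{value}: [{low:.2f}, {high:.2f}]")
```

Every interval has width s / (N + s), so the estimate is wide when few instances are available and narrows as the sample grows. With 3 relevant and 1 irrelevant instance and s = 1, the interval for "relevant" is [0.6, 0.8] instead of the single point estimate 0.75.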


Keywords: Multi-label classification, Ensemble of classifier chains, Credal C4.5, C4.5, Imprecise probabilities, Noise



This work has been supported by the Spanish “Ministerio de Economía y Competitividad” and by “Fondo Europeo de Desarrollo Regional” (FEDER) under Project TEC2015-69496-R.



Copyright information

© Springer-Verlag GmbH Germany, part of Springer Nature 2019

Authors and Affiliations

  • S. Moral-García (1)
  • Carlos J. Mantas (1, corresponding author)
  • Javier G. Castellano (1)
  • Joaquín Abellán (1)

  1. Department of Computer Science and Artificial Intelligence, University of Granada, Granada, Spain
