Dynamic Classifier Chain with Random Decision Trees

  • Moritz Kulessa
  • Eneldo Loza Mencía
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11198)


Classifier chains (CC) are an effective approach for exploiting label dependencies in multi-label data. However, the approach has the disadvantage that the label ordering of the chain is either chosen completely at random or relies on a pre-specified ordering that is expensive to compute. Moreover, the same ordering is used for every test instance, ignoring the fact that different orderings might be best suited for different test instances. We propose a new approach based on random decision trees (RDT) that chooses the label ordering for each prediction dynamically, depending on the respective test instance. RDT are not adapted to a specific learning task; instead, they allow a prediction objective to be defined on the fly at test time, thus offering a perfect test bed for directly comparing different prediction schemes. Indeed, we show that dynamically selecting the next label improves over using a static ordering of the labels under an otherwise unchanged RDT model and experimental environment.
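
The idea of choosing the next label per test instance can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state the selection criterion, so the sketch assumes the next label is the undecided one whose estimated probability is farthest from 0.5. The names `make_count_estimator`, `dynamic_chain_predict` and `cond_prob` are hypothetical, and the estimator is a toy stand-in for an RDT that ignores the features and treats all training label vectors as if they had landed in a single leaf.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical interface: cond_prob(x, decided, j) estimates
# P(y_j = 1 | x, labels decided so far).
CondProb = Callable[[Sequence[float], Dict[int, int], int], float]


def make_count_estimator(Y: Sequence[Sequence[int]]) -> CondProb:
    """Toy stand-in for an RDT (assumption): ignores x, uses all rows as one leaf."""

    def cond_prob(x: Sequence[float], decided: Dict[int, int], j: int) -> float:
        # Keep only training label vectors that agree with the decided labels.
        rows = [r for r in Y if all(r[k] == v for k, v in decided.items())]
        ones = sum(r[j] for r in rows)
        return (ones + 1) / (len(rows) + 2)  # Laplace-smoothed frequency

    return cond_prob


def dynamic_chain_predict(
    x: Sequence[float], n_labels: int, cond_prob: CondProb
) -> Tuple[List[int], List[int]]:
    """Predict all labels, picking the most confident undecided label at each step."""
    decided: Dict[int, int] = {}
    order: List[int] = []
    while len(decided) < n_labels:
        remaining = [j for j in range(n_labels) if j not in decided]
        probs = {j: cond_prob(x, decided, j) for j in remaining}
        # Assumed criterion: the estimate farthest from 0.5 is decided next.
        nxt = max(remaining, key=lambda j: abs(probs[j] - 0.5))
        decided[nxt] = int(probs[nxt] >= 0.5)
        order.append(nxt)
    return [decided[j] for j in range(n_labels)], order


# Made-up training label vectors, for illustration only.
Y_train = [
    [0, 1, 1],
    [0, 1, 1],
    [0, 1, 1],
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
]
prediction, order = dynamic_chain_predict([0.0], 3, make_count_estimator(Y_train))
print(prediction, order)  # [0, 1, 1] [1, 2, 0]
```

In this toy run the chain visits the labels in the order 1, 2, 0 rather than the fixed order 0, 1, 2, because label 1 has the most confident estimate at the start. With a real RDT, the statistics would presumably come from the leaves reached by the test instance, so the chosen order could differ from one instance to the next.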


Keywords: Multi-label classification, Random decision trees, Classifier chains



Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  1. Knowledge Engineering Group, Technische Universität Darmstadt, Darmstadt, Germany
