Abstract
This paper addresses multi-label classification using the classifier chain scheme. We consider the problem of building a diverse ensemble of classifier chains. For this purpose, we propose a permutation-based criterion of chain diversity. The final ensemble is built using a multi-objective genetic algorithm that optimises classification quality and chain diversity simultaneously. The proposed methods were evaluated on 29 benchmark datasets, and the comparison was performed using four different multi-label evaluation measures. The experimental study shows that the proposed approach provides better classification quality than response-based diversity criteria.
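The abstract does not define the permutation-based criterion, so the following is only an illustrative sketch of what such a measure could look like, not the authors' formulation. It assumes that each chain is represented by its label ordering (a permutation of label indices) and that diversity is the mean normalised Kendall tau distance between all pairs of orderings. The function names `kendall_tau_distance` and `ensemble_diversity` are hypothetical.

```python
from itertools import combinations


def kendall_tau_distance(p, q):
    """Count label pairs that the chain orders p and q place in opposite order.

    Assumption: p and q are permutations of the same set of label indices.
    """
    pos_q = {label: i for i, label in enumerate(q)}
    discordant = 0
    # combinations() keeps the order of p, so a always precedes b in p.
    for a, b in combinations(p, 2):
        if pos_q[a] > pos_q[b]:
            discordant += 1
    return discordant


def ensemble_diversity(chains):
    """Mean pairwise Kendall tau distance, normalised to the range [0, 1].

    Assumption: this stands in for a permutation-based diversity criterion;
    the criterion actually used in the paper may differ.
    """
    n_labels = len(chains[0])
    max_dist = n_labels * (n_labels - 1) / 2
    pairs = list(combinations(chains, 2))
    if not pairs or max_dist == 0:
        return 0.0
    total = sum(kendall_tau_distance(p, q) for p, q in pairs)
    return total / (max_dist * len(pairs))
```

Under these assumptions, an ensemble of identical chain orders scores 0 and a pair of fully reversed orders scores 1. Such a value depends only on the label permutations, not on classifier outputs, which is what distinguishes it from a response-based criterion. In a multi-objective search it could serve as one objective alongside a classification-quality measure.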
Acknowledgement
This work was supported by the statutory funds of the Department of Systems and Computer Networks, Wroclaw University of Science and Technology. Computational resources were provided by PL-Grid Infrastructure.
Copyright information
© 2018 Springer International Publishing AG
About this paper
Cite this paper
Trajdos, P., Kurzynski, M. (2018). Permutation-Based Diversity Measure for Classifier-Chain Approach. In: Kurzynski, M., Wozniak, M., Burduk, R. (eds) Proceedings of the 10th International Conference on Computer Recognition Systems CORES 2017. CORES 2017. Advances in Intelligent Systems and Computing, vol 578. Springer, Cham. https://doi.org/10.1007/978-3-319-59162-9_43
DOI: https://doi.org/10.1007/978-3-319-59162-9_43
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-59161-2
Online ISBN: 978-3-319-59162-9
eBook Packages: Engineering, Engineering (R0)