Abstract
Existing multi-label learning approaches assume that all labels in a dataset are equally important. In the real world, however, labels generally differ in importance. In this paper, we introduce multi-label importance (MLI), which measures label importance from two perspectives: label predictability and label effects. Both can be extracted from the training data before any multi-label model is built, and the resulting importance information can then be used in existing approaches to improve multi-label learning performance. To demonstrate this, we propose a classifier chain algorithm based on multi-label importance ranking and an improved kNN-based algorithm that takes both feature distance and label distance into consideration. We apply our algorithms to benchmark datasets and show that exploiting multi-label importance yields efficient multi-label learning. Our experiments also show a strong positive correlation between label predictability and label effects.
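The idea of ordering a classifier chain by label importance can be illustrated with a minimal sketch. The abstract does not define how label predictability is computed, so everything below is an assumption made for illustration: predictability is approximated by leave-one-out 1-NN accuracy per label, the base learner is a plain 1-NN classifier, and the function names (`predictability`, `importance_order`, `chain_predict`) and the toy data are hypothetical rather than taken from the paper.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nn_predict(train_X, train_y, x, skip=None):
    """1-NN prediction: return the target of the training row closest to x.

    `skip` optionally excludes one training index (used for leave-one-out).
    """
    best = min((i for i in range(len(train_X)) if i != skip),
               key=lambda i: sq_dist(train_X[i], x))
    return train_y[best]


def predictability(X, Y, j):
    """ASSUMED proxy for label predictability: leave-one-out 1-NN accuracy
    of label j. The paper's actual measure is not given in the abstract."""
    col = [row[j] for row in Y]
    hits = sum(nn_predict(X, col, X[i], skip=i) == col[i]
               for i in range(len(X)))
    return hits / len(X)


def importance_order(X, Y):
    """Rank label indices from most to least predictable (stable sort)."""
    scores = [predictability(X, Y, j) for j in range(len(Y[0]))]
    return sorted(range(len(scores)), key=lambda j: -scores[j])


def chain_predict(X, Y, order, x):
    """Classifier chain: predict labels in `order`; each step sees the
    features plus the labels handled earlier in the chain."""
    pred = [0] * len(Y[0])
    aug_train = [list(row) for row in X]  # training rows, grown per step
    aug_x = list(x)                       # query row, grown per step
    for j in order:
        col = [row[j] for row in Y]
        pred[j] = nn_predict(aug_train, col, aug_x)
        for train_row, label_row in zip(aug_train, Y):
            train_row.append(label_row[j])  # true label on training rows
        aug_x.append(pred[j])               # predicted label on the query
    return pred


# Toy data (hypothetical): label 1 follows the two clusters, label 0 is noisy.
X = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [0.9, 0.8], [1.0, 1.0], [0.8, 0.9]]
Y = [[0, 0], [1, 0], [0, 0], [1, 1], [0, 1], [1, 1]]

order = importance_order(X, Y)
pred = chain_predict(X, Y, order, [0.05, 0.0])
print("chain order:", order)
print("prediction:", pred)
```

Placing the more predictable label first means the later, harder label is conditioned on an input that is less likely to be wrong, which is one plausible motivation for an importance-based chain order; whether the paper ranks labels this way cannot be confirmed from the abstract alone.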
Notes
1. Data sets were downloaded from http://mulan.sourceforge.net/datasets.html and http://meka.sourceforge.net/#datasets.
Acknowledgement
This work was supported by NSF Chongqing China (cstc2017zdcy-zdyf0366).
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
About this paper
Cite this paper
Wang, D., Li, L., Wang, J., Hu, F., Zhang, X. (2018). Extracting Label Importance Information for Multi-label Classification. In: Pei, J., Manolopoulos, Y., Sadiq, S., Li, J. (eds) Database Systems for Advanced Applications. DASFAA 2018. Lecture Notes in Computer Science, vol 10828. Springer, Cham. https://doi.org/10.1007/978-3-319-91458-9_26
DOI: https://doi.org/10.1007/978-3-319-91458-9_26
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-91457-2
Online ISBN: 978-3-319-91458-9
eBook Packages: Computer Science, Computer Science (R0)