Extracting Label Importance Information for Multi-label Classification

Wang, Dengbao; Li, Li; Wang, Jingyuan; Hu, Fei; Zhang, Xiuzhen

doi:10.1007/978-3-319-91458-9_26

Conference paper
First Online: 12 May 2018

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10828)

Abstract

Existing multi-label learning approaches assume that all labels in a dataset are equally important. In the real world, however, the importance of each label generally differs. In this paper, we introduce multi-label importance (MLI), which measures label importance from two perspectives: label predictability and label effects. Both can be extracted from the training data before any model for multi-label learning is built. The resulting multi-label importance information can then be used in existing approaches to improve the performance of multi-label learning. To demonstrate this, we propose a classifier chain algorithm based on multi-label importance ranking and an improved kNN-based algorithm that takes both feature distance and label distance into consideration. We apply our algorithms to benchmark datasets and demonstrate efficient multi-label learning by exploiting multi-label importance. It is also worth mentioning that our experiments show a strong positive correlation between label predictability and label effects.
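The abstract does not say how label predictability is computed or how label distance is defined, so the following is only a minimal sketch under stated assumptions, not the authors' method. It uses leave-one-out 1-nearest-neighbour accuracy as a stand-in proxy for label predictability, orders the labels of a classifier chain by that score, and uses a weighted Hamming distance as one possible label distance. The function names (`loo_1nn_accuracy`, `chain_order`, `weighted_hamming`) and the toy data are hypothetical.

```python
from typing import List, Sequence, Tuple


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def loo_1nn_accuracy(X: List[Sequence[float]], y: Sequence[int]) -> float:
    """Leave-one-out 1-NN accuracy for one binary label.

    Assumption: used here only as a simple proxy for label predictability.
    """
    correct = 0
    for i, xi in enumerate(X):
        # Nearest other training instance in feature space.
        nearest = min(
            (j for j in range(len(X)) if j != i),
            key=lambda j: squared_distance(xi, X[j]),
        )
        correct += int(y[nearest] == y[i])
    return correct / len(X)


def chain_order(
    X: List[Sequence[float]], Y: List[Sequence[int]]
) -> Tuple[List[int], List[float]]:
    """Rank labels from most to least predictable (hypothetical chain order)."""
    n_labels = len(Y[0])
    scores = [
        loo_1nn_accuracy(X, [row[k] for row in Y]) for k in range(n_labels)
    ]
    order = sorted(range(n_labels), key=lambda k: -scores[k])
    return order, scores


def weighted_hamming(
    a: Sequence[int], b: Sequence[int], weights: Sequence[float]
) -> float:
    """Label distance: sum of the weights of the positions where a and b differ."""
    return sum(w for x, y, w in zip(a, b, weights) if x != y)


# Toy data: one feature, two labels. Label 1 follows the feature, label 0 does not.
X = [[0.0], [0.1], [0.9], [1.0]]
Y = [[0, 0], [1, 0], [0, 1], [1, 1]]

order, scores = chain_order(X, Y)
print(order, scores)
```

On this toy data the label that tracks the feature is placed first in the chain. In a kNN-style method, `weighted_hamming` could be combined with a feature distance so that disagreements on more important labels count more; how the two distances are actually combined in the paper is not stated in the abstract.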

Notes

1. Data sets were downloaded from http://mulan.sourceforge.net/datasets.html and http://meka.sourceforge.net/#datasets.

Acknowledgement

This work was supported by NSF Chongqing, China (cstc2017zdcy-zdyf0366).

Author information

Authors and Affiliations

College of Computer and Information Science, Southwest University, Chongqing, China
Dengbao Wang, Li Li, Jingyuan Wang & Fei Hu
School of Computer Science and Information Technology, RMIT University, Melbourne, Australia
Xiuzhen Zhang

Corresponding author

Correspondence to Li Li.

Editor information

Editors and Affiliations

Simon Fraser University, Burnaby, BC, Canada
Jian Pei
Aristotle University of Thessaloniki, Thessaloniki, Greece
Yannis Manolopoulos
University of Queensland, Brisbane, QLD, Australia
Shazia Sadiq
University of Western Australia, Crawley, WA, Australia
Jianxin Li

Cite this paper

Wang, D., Li, L., Wang, J., Hu, F., Zhang, X. (2018). Extracting Label Importance Information for Multi-label Classification. In: Pei, J., Manolopoulos, Y., Sadiq, S., Li, J. (eds) Database Systems for Advanced Applications. DASFAA 2018. Lecture Notes in Computer Science, vol 10828. Springer, Cham. https://doi.org/10.1007/978-3-319-91458-9_26

DOI: https://doi.org/10.1007/978-3-319-91458-9_26
Published: 12 May 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-91457-2
Online ISBN: 978-3-319-91458-9
eBook Packages: Computer Science, Computer Science (R0)