Extracting Label Importance Information for Multi-label Classification

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10828)

Abstract

Existing multi-label learning approaches assume that all labels in a dataset are equally important. In the real world, however, the importance of each label generally differs. In this paper, we introduce multi-label importance (MLI), which measures label importance from two perspectives: label predictability and label effects. Both can be extracted from the training data before any model for multi-label learning is built. The resulting importance information can then be used in existing approaches to improve the performance of multi-label learning. To demonstrate this, we propose a classifier chain algorithm based on multi-label importance ranking and an improved kNN-based algorithm that takes both feature distance and label distance into consideration. We apply our algorithms to benchmark datasets, and the results demonstrate efficient multi-label learning by exploiting multi-label importance. Our experiments also show a strong positive correlation between label predictability and label effects.
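The two ideas named in the abstract, ordering a classifier chain by label importance and combining feature distance with label distance in kNN, can be illustrated with a minimal sketch. The abstract does not define how label predictability or label effects are computed, so everything below is an assumption made for illustration: `loo_1nn_predictability` uses per-label leave-one-out 1-NN accuracy as a stand-in for predictability, `chain_order` simply sorts labels by that score, and `combined_distance` mixes Euclidean feature distance with an importance-weighted Hamming distance through a hypothetical weight `alpha`. None of these names or formulas come from the paper.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def loo_1nn_predictability(X, Y):
    """Assumed proxy for label predictability (not the paper's definition):
    per-label accuracy of a leave-one-out 1-nearest-neighbour prediction."""
    n, q = len(X), len(Y[0])
    hits = [0] * q
    for i in range(n):
        # Nearest *other* training instance in feature space.
        j = min((k for k in range(n) if k != i),
                key=lambda k: euclidean(X[i], X[k]))
        for l in range(q):
            hits[l] += int(Y[i][l] == Y[j][l])
    return [h / n for h in hits]

def chain_order(scores):
    """Label indices sorted so the highest-scoring label comes first
    (ties broken by label index)."""
    return sorted(range(len(scores)), key=lambda l: (-scores[l], l))

def weighted_hamming(y1, y2, w):
    """Hamming distance between label vectors, each label weighted by w."""
    return sum(wl for a, b, wl in zip(y1, y2, w) if a != b)

def combined_distance(x1, y1, x2, y2, w, alpha=0.5):
    """Hypothetical mix of feature distance and label distance;
    alpha is an illustrative trade-off parameter."""
    return alpha * euclidean(x1, x2) + (1 - alpha) * weighted_hamming(y1, y2, w)

# Toy data: label 0 follows the feature clusters, label 1 does not.
X = [[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [1.1, 1.0]]
Y = [[0, 1], [0, 0], [1, 1], [1, 0]]

scores = loo_1nn_predictability(X, Y)
order = chain_order(scores)
d = combined_distance(X[0], Y[0], X[2], Y[2], scores)
print(scores, order, d)
```

On this toy data, label 0 is placed first in the chain because its neighbours always agree on it, while label 1 receives zero weight in the label distance. The sketch only defines the distance between two labelled instances; how a label vector is obtained for an unlabelled query is not stated in the abstract and is left open here.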

Notes

  1. Data sets were downloaded from http://mulan.sourceforge.net/datasets.html and http://meka.sourceforge.net/#datasets.

Acknowledgement

This work was supported by NSF Chongqing China (cstc2017zdcy-zdyf0366).

Author information

Correspondence to Li Li.

Copyright information

© 2018 Springer International Publishing AG, part of Springer Nature

About this paper

Cite this paper

Wang, D., Li, L., Wang, J., Hu, F., Zhang, X. (2018). Extracting Label Importance Information for Multi-label Classification. In: Pei, J., Manolopoulos, Y., Sadiq, S., Li, J. (eds) Database Systems for Advanced Applications. DASFAA 2018. Lecture Notes in Computer Science, vol 10828. Springer, Cham. https://doi.org/10.1007/978-3-319-91458-9_26

  • DOI: https://doi.org/10.1007/978-3-319-91458-9_26

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-91457-2

  • Online ISBN: 978-3-319-91458-9

  • eBook Packages: Computer Science, Computer Science (R0)
