Supervised representation learning for multi-label classification

  • Ming Huang
  • Fuzhen Zhuang
  • Xiao Zhang
  • Xiang Ao
  • Zhengyu Niu
  • Min-Ling Zhang
  • Qing He
Part of the following topical collections:
  1. Special Issue of the ACML 2018 Journal Track


Representation learning is one of the most important aspects of multi-label learning because of the intricate nature of multi-label data. Current research on representation learning either fails to consider label knowledge or suffers from a shortage of labeled data, and most approaches learn the representation and incorporate label information in two separate steps. In this paper, motivated by the success of deep learning at representation learning, we propose SERL, a novel neural-network-based framework that learns a global feature representation by jointly considering all labels in an effective supervised manner. At its core, a two-encoding-layer autoencoder, which can exploit both labeled and unlabeled data, learns the feature representation under the supervision of a softmax regression. The softmax regression incorporates label knowledge to improve both representation learning and multi-label learning by being optimized jointly with the autoencoder. Moreover, the autoencoder is expanded into two encoding layers so that it can share knowledge with the softmax regression through the shared second encoding weight matrix. Extensive experiments on five real-world datasets demonstrate the superiority of SERL over other state-of-the-art multi-label learning approaches.
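The joint objective described above — an unsupervised reconstruction term plus a supervised softmax term that reuses the second encoding weight matrix — can be sketched as follows. This is a minimal illustration, not the authors' implementation: all dimensions, weight names, and the normalization of the label matrix into a distribution for the softmax cross-entropy are assumptions made for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

# Hypothetical sizes: d features, h hidden units, q labels, n examples.
d, h, q, n = 20, 12, 5, 16
X = rng.normal(size=(n, d))                        # features (labeled + unlabeled)
Y = (rng.random(size=(n, q)) < 0.3).astype(float)  # binary multi-label matrix

W1 = rng.normal(scale=0.1, size=(d, h))  # first encoding weights
W2 = rng.normal(scale=0.1, size=(h, q))  # second encoding weights, SHARED with softmax
V1 = rng.normal(scale=0.1, size=(q, h))  # decoding weights
V2 = rng.normal(scale=0.1, size=(h, d))

def joint_loss(X, Y, alpha=1.0):
    """Joint objective: reconstruction error + alpha * supervised softmax term.

    Because the softmax regression reuses W2, label knowledge flows back
    into the learned representation when both terms are optimized together.
    """
    H1 = sigmoid(X @ W1)       # first encoding layer
    Z = H1 @ W2                # shared pre-activation
    H2 = sigmoid(Z)            # second encoding = learned representation
    X_hat = sigmoid(sigmoid(H2 @ V1) @ V2)  # symmetric decoder
    recon = np.mean((X - X_hat) ** 2)       # unsupervised reconstruction term

    # Supervised term: softmax regression over the same pre-activation Z.
    # Normalizing each label row to a distribution is a simplification here.
    P = softmax(Z)
    Yn = Y / np.maximum(Y.sum(axis=1, keepdims=True), 1.0)
    xent = -np.mean(np.sum(Yn * np.log(P + 1e-12), axis=1))
    return recon + alpha * xent

loss = joint_loss(X, Y)
```

In practice both terms would be minimized together by gradient descent over all weight matrices; the key design choice sketched here is that `W2` appears in both the reconstruction path and the classification path, so neither term can be optimized in isolation.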


Keywords: Representation learning, Multi-label learning, Two-encoding-layer autoencoder



The research work is supported by the National Key Research and Development Program of China under Grant No. 2018YFB1004300, the National Natural Science Foundation of China under Grant Nos. 61773361, U1836206, U1811461, the Project of Youth Innovation Promotion Association CAS under Grant No. 2017146.



Copyright information

© The Author(s), under exclusive licence to Springer Science+Business Media LLC, part of Springer Nature 2019

Authors and Affiliations

  1. Key Lab of Intelligent Information Processing of Chinese Academy of Sciences (CAS), Institute of Computing Technology, CAS, Beijing, China
  2. University of Chinese Academy of Sciences, Beijing, China
  3. State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, China
  4. Baidu Inc., Beijing, China
  5. Southeast University, Nanjing, China
