Machine Learning, Volume 107, Issue 4, pp 703–725

Learning safe multi-label prediction for weakly labeled data

  • Tong Wei
  • Lan-Zhe Guo
  • Yu-Feng Li
  • Wei Gao


In this paper we study multi-label learning with weakly labeled data, i.e., training examples whose labels are incomplete, a situation that commonly occurs in real applications such as image classification and document categorization. This setting includes, e.g., (i) semi-supervised multi-label learning, where only a subset of the training examples is completely labeled; (ii) weak label learning, where only some of the relevant labels of each example are known; and (iii) extended weak label learning, where only some of the relevant and irrelevant labels of each example are known. Previous studies often expect that using weakly labeled data will improve learning performance, since more data are employed. This, however, is not always the case in practice: weakly labeled data may sometimes degrade learning performance. It is therefore desirable to learn safe multi-label predictions that do not hurt performance when weakly labeled data are involved in the learning procedure. In this work we optimize multi-label evaluation metrics (\(\hbox{F}_1\) score and Top-k precision) under the assumption that the ground-truth label assignment is realized by a convex combination of base multi-label learners. To cope with the infinite number of possible ground-truth label assignments, a cutting-plane strategy is adopted to iteratively generate the most helpful label assignments. The whole optimization is then cast as a series of simple linear programs, which keeps it efficient. Extensive experiments on three weakly labeled learning tasks, namely (i) semi-supervised multi-label learning, (ii) weak label learning and (iii) extended weak label learning, clearly show that our proposal improves the safeness of using weakly labeled data compared with many state-of-the-art methods.
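
As a point of reference for the two evaluation metrics named above, the following is a minimal sketch of how an \(\hbox{F}_1\) score and a Top-k precision might be computed for a single example. It assumes labels are given as 0/1 vectors and predictions as either 0/1 vectors or real-valued scores; the function names `f1_score` and `top_k_precision`, and the convention for empty label sets, are illustrative assumptions and are not taken from the paper, which may use a different averaging scheme.

```python
from typing import Sequence


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Per-example F1: 2 * |true AND pred| / (|true| + |pred|)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    denom = sum(y_true) + sum(y_pred)
    # Assumed convention: F1 is 1.0 when both label sets are empty.
    return 2.0 * tp / denom if denom > 0 else 1.0


def top_k_precision(y_true: Sequence[int], scores: Sequence[float], k: int) -> float:
    """Fraction of the k highest-scored labels that are relevant."""
    # Rank label indices by descending score (ties keep their original order).
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return sum(y_true[j] for j in ranked[:k]) / k


# Hypothetical example with four candidate labels.
y_true = [1, 0, 1, 0]
y_pred = [1, 1, 0, 0]
scores = [0.9, 0.8, 0.3, 0.1]
print(f1_score(y_true, y_pred))            # one true positive out of 2 + 2 labels
print(top_k_precision(y_true, scores, 2))  # one of the top two labels is relevant
```

Averaging these per-example values over a test set gives one common dataset-level score; other averaging choices (for instance per label) are equally possible.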


Keywords: Multi-label learning; Weakly labeled data; Safe; Evaluation metric



The authors thank Dr. Xiangnan Kong (Worcester Polytechnic Institute) for constructive and valuable suggestions, and the associate editor and reviewers for helpful comments and suggestions. This research was partially supported by the National Science Foundation of China (61772262, 61403186), the National Key Research and Development Program of China (2017YFB1001900), the MSRA Collaborative Research Fund and the Jiangsu Science Foundation (BK20150586).



Copyright information

© The Author(s) 2017

Authors and Affiliations

  1. National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, China
