Frontiers of Computer Science, Volume 12, Issue 2, pp 191–202

Binary relevance for multi-label learning: an overview

  • Min-Ling Zhang
  • Yu-Kun Li
  • Xu-Ying Liu
  • Xin Geng
Review Article

Abstract

Multi-label learning deals with problems where each example is represented by a single instance while being associated with multiple class labels simultaneously. Binary relevance is arguably the most intuitive solution for learning from multi-label examples. It works by decomposing the multi-label learning task into a number of independent binary learning tasks (one per class label). In view of its potential weakness in ignoring correlations between labels, many correlation-enabling extensions to binary relevance have been proposed in the past decade. In this paper, we aim to review the state of the art of binary relevance from three perspectives. First, basic settings for multi-label learning and binary relevance solutions are briefly summarized. Second, representative strategies to provide binary relevance with label correlation exploitation abilities are discussed. Third, some of our recent studies on binary relevance aimed at issues other than label correlation exploitation are introduced. In conclusion, we provide suggestions on future research directions.
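
As a rough illustration of the decomposition described above, the following Python sketch trains one independent binary classifier per class label and stacks their outputs into a label vector. The class names `BinaryRelevance` and `CentroidBinaryClassifier`, the nearest-centroid base learner, and the toy data are assumptions made for illustration only; they are not taken from the paper, and any binary learner could be substituted.

```python
from typing import List, Sequence


class CentroidBinaryClassifier:
    """Hypothetical toy base learner: predicts the class whose mean
    feature vector (centroid) is closer to the query instance."""

    @staticmethod
    def _mean(rows: Sequence[Sequence[float]]) -> List[float]:
        n = len(rows)
        return [sum(col) / n for col in zip(*rows)]

    @staticmethod
    def _sqdist(a: Sequence[float], b: Sequence[float]) -> float:
        return sum((u - v) ** 2 for u, v in zip(a, b))

    def fit(self, X, y):
        pos = [x for x, t in zip(X, y) if t == 1]
        neg = [x for x, t in zip(X, y) if t == 0]
        # A centroid is None when the label has no examples of that class.
        self.pos_ = self._mean(pos) if pos else None
        self.neg_ = self._mean(neg) if neg else None
        return self

    def predict(self, X) -> List[int]:
        if self.pos_ is None:
            return [0] * len(X)
        if self.neg_ is None:
            return [1] * len(X)
        return [
            1 if self._sqdist(x, self.pos_) < self._sqdist(x, self.neg_) else 0
            for x in X
        ]


class BinaryRelevance:
    """One independent binary model per class label (illustrative sketch)."""

    def __init__(self, make_base=CentroidBinaryClassifier):
        self.make_base = make_base

    def fit(self, X, Y):
        n_labels = len(Y[0])
        # Column j of Y is the binary target for the j-th label.
        self.models_ = [
            self.make_base().fit(X, [row[j] for row in Y])
            for j in range(n_labels)
        ]
        return self

    def predict(self, X) -> List[List[int]]:
        per_label = [model.predict(X) for model in self.models_]
        # Transpose so that each row is the label vector of one instance.
        return [list(col) for col in zip(*per_label)]


# Toy data: label 0 is relevant when the first feature is high,
# label 1 is relevant when the second feature is high.
X = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
Y = [[0, 0], [0, 1], [1, 0], [1, 1]]

br = BinaryRelevance().fit(X, Y)
predictions = br.predict([[0.9, 0.1], [0.2, 0.8]])
print(predictions)
```

Because each label is learned in isolation, a model of this kind cannot use correlations between labels, which is the weakness that the correlation-enabling extensions aim to address.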

Keywords

machine learning, multi-label learning, binary relevance, label correlation, class-imbalance, relative labeling-importance

Notes

Acknowledgements

The authors would like to thank the associate editor and anonymous reviewers for their helpful comments and suggestions. This work was supported by the National Natural Science Foundation of China (Grant Nos. 61573104, 61622203), the Natural Science Foundation of Jiangsu Province (BK20141340), the Fundamental Research Funds for the Central Universities (2242017K40140), and partially supported by the Collaborative Innovation Center of Novel Software Technology and Industrialization.

Supplementary material

11704_2017_7031_MOESM1_ESM.ppt (224 kb)
Binary relevance for multi-label learning: an overview

Copyright information

© Higher Education Press and Springer-Verlag GmbH Germany, part of Springer Nature 2018

Authors and Affiliations

  • Min-Ling Zhang (1, 2, 3)
  • Yu-Kun Li (1, 2, 3)
  • Xu-Ying Liu (1, 2, 3)
  • Xin Geng (1, 2, 3)

  1. School of Computer Science and Engineering, Southeast University, Nanjing, China
  2. Key Laboratory of Computer Network and Information Integration (Southeast University), Ministry of Education, Nanjing, China
  3. Collaborative Innovation Center for Wireless Communications Technology, Nanjing, China
