Multi-label Feature Selection Method Combining Unbiased Hilbert-Schmidt Independence Criterion with Controlled Genetic Algorithm

  • Chang Liu
  • Quan Ma
  • Jianhua Xu
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11304)


In multi-label learning, redundant and irrelevant features increase computational cost and can even degrade classification performance; they are widely dealt with via a feature selection procedure. The unbiased Hilbert-Schmidt independence criterion (HSIC) is a kernel-based measure of dependence between feature and label data, which has been combined with greedy search techniques (e.g., sequential forward selection) to search for a locally optimal feature subset. Alternatively, it is possible to pursue a globally optimal solution using a genetic algorithm (GA), but the final solution usually tends to select about half of the original features. In this paper, we propose a new GA variant that controls the number of selected features (CGA for short). CGA is then integrated with HSIC to formulate a novel multi-label feature selection technique (CGAHSIC) for a given feature subset size. The effectiveness of the proposed CGAHSIC is validated through comparison with four existing algorithms on four benchmark data sets, according to four indicative multi-label classification evaluation metrics (Hamming loss, accuracy, F1 and subset accuracy).
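For readers unfamiliar with the criterion named in the abstract, the unbiased HSIC estimator of Song et al. over two kernel matrices can be sketched in a few lines. This is an illustrative NumPy sketch (function and variable names are ours), not code from the paper:

```python
import numpy as np

def unbiased_hsic(K, L):
    """Unbiased HSIC estimator (Song et al., 2012) for n x n kernel
    matrices K (over features) and L (over labels); requires n >= 4."""
    n = K.shape[0]
    Kt = K - np.diag(np.diag(K))   # zero the diagonal, as the estimator requires
    Lt = L - np.diag(np.diag(L))
    term1 = np.trace(Kt @ Lt)
    term2 = Kt.sum() * Lt.sum() / ((n - 1) * (n - 2))
    term3 = 2.0 * (Kt @ Lt).sum() / (n - 2)
    return (term1 + term2 - term3) / (n * (n - 3))

# Toy check with linear kernels: perfectly dependent data scores positive,
# while a constant (uninformative) kernel scores zero.
x = np.arange(10.0).reshape(-1, 1)
K = x @ x.T
print(unbiased_hsic(K, K))                   # positive: strong dependence
print(unbiased_hsic(K, np.ones((10, 10))))   # zero: no dependence
```

In a feature selection setting, K would be built from a candidate feature subset and L from the label matrix, and subsets with larger HSIC values are preferred.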


Keywords: Multi-label learning · Feature selection · Hilbert-Schmidt independence criterion · Sequential forward selection · Genetic algorithm
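The core idea behind a size-controlled GA — constraining every chromosome to select exactly k features, rather than letting the population drift toward half of them — can be illustrated with a repair operator. This is a minimal sketch under our own assumptions (the paper's exact CGA operators are not given here, and `fitness` is a pluggable placeholder rather than HSIC):

```python
import random

def repair(chrom, k, rng):
    """Flip surplus 1s (or deficit 0s) at random so exactly k features stay selected."""
    ones = [i for i, g in enumerate(chrom) if g]
    zeros = [i for i, g in enumerate(chrom) if not g]
    if len(ones) > k:
        for i in rng.sample(ones, len(ones) - k):
            chrom[i] = 0
    elif len(ones) < k:
        for i in rng.sample(zeros, k - len(ones)):
            chrom[i] = 1
    return chrom

def evolve(fitness, n_features, k, pop_size=20, generations=30, pmut=0.02, seed=0):
    """Elitist GA over fixed-size feature subsets encoded as 0/1 chromosomes."""
    rng = random.Random(seed)
    pop = []
    for _ in range(pop_size):                    # initial population: exact-k subsets
        chrom = [0] * n_features
        for i in rng.sample(range(n_features), k):
            chrom[i] = 1
        pop.append(chrom)
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        survivors = pop[:pop_size // 2]          # keep the better half
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            cut = rng.randrange(1, n_features)   # one-point crossover
            child = a[:cut] + b[cut:]
            child = [1 - g if rng.random() < pmut else g for g in child]
            children.append(repair(child, k, rng))  # restore exactly k selections
        pop = survivors + children
    return max(pop, key=fitness)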


  1. 1.
    Duda, R.O., Hart, P.E., Stork, D.G.: Pattern Classification, 2nd edn. Wiley, New York (2001)zbMATHGoogle Scholar
  2. 2.
    Herrera, F., Charte, F., Rivera, A.J., del Jesus, M.J.: Multilabel Classification: Problem Analysis: Metrics and Techniques. Springer, Switzerland (2016). Scholar
  3. 3.
    Tsoumakas, G., Katakis, I.: Multi-label classification: an overview. Int. J. Data Warehouse Min. 3(3), 1–13 (2007)CrossRefGoogle Scholar
  4. 4.
    Zhang, M., Zhou, Z.: A review on multi-label learning algorithms. IEEE Trans. Knowl. Data Eng. 26(8), 1338–1351 (2014)CrossRefGoogle Scholar
  5. 5.
    Kashef, S., Nezamabadi-pour, H., Nipour, B.: Multilabel feature selection: a comprehensiove review and guide experiments. WIREs Data Min. Knowl. Discov. 8(2), e1240 (2018)CrossRefGoogle Scholar
  6. 6.
    Pereira, R., Plastino, A., Zadrozny, B., Merschmann, L.H.C.: Categorizing feature selection methods for multi-label classification. Artif. Intell. Rev. 49(1), 57–78 (2018)CrossRefGoogle Scholar
  7. 7.
    Lee, J., Kim, D.W.: Feature selection for multi-label classification using multivariate mutual information. Pattern Recogn. Lett. 34(3), 349–357 (2013)CrossRefGoogle Scholar
  8. 8.
    Lee, J., Kim, D.W.: Fast multi-label feature selection based on information-theoretic feature ranking. Pattern Recogn. 48(9), 2761–2771 (2015)CrossRefGoogle Scholar
  9. 9.
    Lee, J., Kim, D.W.: SCLS: multi-label feature selection based on scalable criterion for large label set. Pattern Recogn. 66, 342–352 (2017)MathSciNetCrossRefGoogle Scholar
  10. 10.
    Lin, Y., Hu, Q., Liu, J., Duan, J.: Multi-label feature selection based on max-dependency and min-redundancy. Neurocompting 168, 92–103 (2015)CrossRefGoogle Scholar
  11. 11.
    Spolaor, N., Chermana, E.A., Monarda, M.C., Lee, H.D.: A comparison of multi-label feature selection methods using the problem transformation approach. Eletronic Notes Theoret. Comput. Sci. 292, 135–151 (2013)CrossRefGoogle Scholar
  12. 12.
    Spolaor, N., Monard, M.C., Tsoumakas, G., Lee, H.D.: A systematic review of multi-label feature selection and a new method based on label construction. Neurocomputing 180, 3–15 (2016)CrossRefGoogle Scholar
  13. 13.
    Chen, W., Yan, J., Zhang, B., Chen, Z., Yang, Q.: Document transformation for multi-label feature selection text categorization. In: 7th IEEE International Conference on Data Mining (ICDM2007), pp. 451–456. IEEE Press, New York (2007)Google Scholar
  14. 14.
    Pupo, O.G.R., Morell, C., Soto, S.V.: ReliefF-ML: an extension of relieff algorithm to multi-label learning. In: Ruiz-Shulcloper, J., Sanniti di Baja, G. (eds.) CIARP 2013. LNCS, vol. 8259, pp. 528–535. Springer, Heidelberg (2013). Scholar
  15. 15.
    Reyes, O., Morell, C., Ventura, S.: Scalable extensions of the relieff algorithm for weighting and selecting features on the multi-label learning context. Neurocomputing 161, 168–182 (2015)CrossRefGoogle Scholar
  16. 16.
    Spolaor, N., Cherman, E., Monard, M., Lee, H.: Relief for multilabel feature selection. In: 2013 Brazlian Conference on Intelligent Systems (BRACIS2013), pp. 6–11. IEEE Press, New York (2013)Google Scholar
  17. 17.
    Kong, D., Ding, C., Huang, H., Zhao, H.: Multi-label relieff and f-statistics feature selection for image annotation. In: 2012 IEEE Conference on Computer Vision and Pattern Recognition (CVPR2012), pp. 2352–2359. IEEE Press, New York (2012)Google Scholar
  18. 18.
    Lewis, D.D., Yang, Y., Rose, T.G., Li, F.: RCV1: a new benchmark collection for text categorization research. J. Mach. Learn. Res. 5, 361–397 (2004)Google Scholar
  19. 19.
    Xu, J.: Effective and efficient multi-label feature selection approaches via modifying Hilbert-Schmidt independence criterion. In: Hirose, A., Ozawa, S., Doya, K., Ikeda, K., Lee, M., Liu, D. (eds.) ICONIP 2016. LNCS, vol. 9949, pp. 385–395. Springer, Cham (2016). Scholar
  20. 20.
    Jungjit, S., Freitas, A.A., Michaelis, M., Cinatl, J.: A multi-label correlation based feature selection method for the classification of neuroblastoma microarray data. In: 12th Industrial Conference on Data Mining (ICDM2012): Workshop on Data Mining and Life Sciences (DMLS2012), pp. 149–157 (2012)Google Scholar
  21. 21.
    Jungjit, S., Freitas, A.A.: A new genetic algorithm for multi-label correlation-based feature selection. In: 23rd European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning (ESANN2015), pp. 285–290 (2015)Google Scholar
  22. 22.
    Lee, J., Kim, D.W.: Memetic feature selection algorithm for multi-label classification. Inf. Sci. 293, 80–95 (2015)CrossRefGoogle Scholar
  23. 23.
    Gretton, A., Bousquet, O., Smola, A., Schölkopf, B.: Measuring statistical dependence with Hilbert-Schmidt norms. In: Jain, S., Simon, H.U., Tomita, E. (eds.) ALT 2005. LNCS (LNAI), vol. 3734, pp. 63–77. Springer, Heidelberg (2005). Scholar
  24. 24.
    Song, L., Smola, A., Bedo, A.G.J., Borgwardt, K.: Feature selection via dependence maximization. J. Mach. Learn. Res. 13, 1393–1434 (2012)MathSciNetzbMATHGoogle Scholar
  25. 25.
    Yin, J., Tao, T., Xu, J.: A multi-label feature selection algorithm based on multi -objective optimization. In: 27th IEEE International Joint Conference on Neural Networks (IJCNN2015), pp. 1–7. IEEE Press, New York (2015)Google Scholar
  26. 26.
    Scholkopf, B., Smola, A.J.: Learning with Kernels: Support Vectors, Regulization, Optimization and Beyond. MIT Press, Cambridge (2001)Google Scholar
  27. 27.
    Holland, J.: Adaptation in Nature and Artificial Systems. MIT Press, Cambridge (1992)Google Scholar
  28. 28.
    Oh, I.S., Lee, J.S., Moon, B.R.: Hybrid genetic algorithms for feature selection. IEEE Trans. Pattern Anal. Mach. Intell. 26(11), 1424–1437 (2004)CrossRefGoogle Scholar
  29. 29.
    Zhang, M., Zhou, Z.: Ml-knn: a lazy learning approach to multi-label learning. Pattern Recognit. 40(7), 2038–2048 (2007)CrossRefGoogle Scholar
  30. 30.
    Demsar, J.: Statistical comparisons of classifiers over multiple data sets. J. Mach. Learn. Res. 7, 1–30 (2006)MathSciNetzbMATHGoogle Scholar

Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  1. 1.School of Computer Science and TechnologyNanjing Normal UniversityNanjingChina

Personalised recommendations