
Multi-label Feature Selection Method Combining Unbiased Hilbert-Schmidt Independence Criterion with Controlled Genetic Algorithm

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 11304)

Abstract

In multi-label learning, redundant and irrelevant features increase computational cost and can even degrade classification performance; such features are widely dealt with via a feature selection procedure. The unbiased Hilbert-Schmidt independence criterion (HSIC) is a kernel-based dependence measure between feature and label data, which has been combined with greedy search techniques (e.g., sequential forward selection) to find a locally optimal feature subset. Alternatively, a globally optimal solution can be pursued with a genetic algorithm (GA), but the final solution usually selects about half of the original features. In this paper, we propose a new GA variant that controls the number of selected features (CGA for short). CGA is then integrated with HSIC to formulate a novel multi-label feature selection technique (CGAHSIC) for a given feature subset size. The effectiveness of the proposed CGAHSIC is validated by comparison with four existing algorithms on four benchmark data sets, according to four indicative multi-label classification evaluation metrics (Hamming loss, accuracy, F1 and subset accuracy).
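The dependence measure named in the abstract, the unbiased HSIC estimator of Song et al. (JMLR 2012), can be sketched as follows. This is an illustrative implementation of the standard published formula, not the authors' code; the linear kernels and toy feature/label data below are assumptions for demonstration only.

```python
import numpy as np

def unbiased_hsic(K, L):
    """Unbiased HSIC estimator (Song et al., JMLR 2012).

    K, L : (m, m) symmetric kernel matrices computed on the candidate
    feature subset and on the label data, respectively; requires m > 3.
    """
    m = K.shape[0]
    Kt = K - np.diag(np.diag(K))          # K-tilde: kernel with zeroed diagonal
    Lt = L - np.diag(np.diag(L))          # L-tilde: kernel with zeroed diagonal
    one = np.ones(m)
    term1 = np.trace(Kt @ Lt)
    term2 = (one @ Kt @ one) * (one @ Lt @ one) / ((m - 1) * (m - 2))
    term3 = (2.0 / (m - 2)) * (one @ Kt @ Lt @ one)
    return (term1 + term2 - term3) / (m * (m - 3))

# Toy usage with linear kernels (an assumption for illustration):
X = np.linspace(0.0, 1.0, 10).reshape(-1, 1)   # one candidate feature
Y = (X > 0.5).astype(float)                    # a label derived from it
K, L = X @ X.T, Y @ Y.T
score = unbiased_hsic(K, L)                    # larger => stronger dependence
```

A convenient sanity check of the estimator: with a constant (uninformative) label kernel the estimate is exactly zero, and the measure is symmetric in its two arguments; a search procedure such as the paper's CGA would score candidate feature subsets by this value.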

This work was supported by the Natural Science Foundation of China under grant No. 61273246.


Notes

  1. http://mulan.sourceforge.net/datasets-mlc.html.

  2. http://cse.seu.edu.cn/PersonalPage/zhangml.


Author information

Corresponding author

Correspondence to Jianhua Xu.



Copyright information

© 2018 Springer Nature Switzerland AG

About this paper


Cite this paper

Liu, C., Ma, Q., Xu, J. (2018). Multi-label Feature Selection Method Combining Unbiased Hilbert-Schmidt Independence Criterion with Controlled Genetic Algorithm. In: Cheng, L., Leung, A., Ozawa, S. (eds.) Neural Information Processing. ICONIP 2018. Lecture Notes in Computer Science, vol. 11304. Springer, Cham. https://doi.org/10.1007/978-3-030-04212-7_1

Download citation

  • DOI: https://doi.org/10.1007/978-3-030-04212-7_1

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-04211-0

  • Online ISBN: 978-3-030-04212-7

  • eBook Packages: Computer Science, Computer Science (R0)
