Multi-label Feature Selection Method Combining Unbiased Hilbert-Schmidt Independence Criterion with Controlled Genetic Algorithm

  • Chang Liu
  • Quan Ma
  • Jianhua Xu
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11304)

Abstract

In multi-label learning, redundant and irrelevant features increase computational cost and can even degrade classification performance; they are widely dealt with via a feature selection procedure. The unbiased Hilbert-Schmidt independence criterion (HSIC) is a kernel-based measure of dependence between feature and label data, and it has been combined with greedy search techniques (e.g., sequential forward selection) to find a locally optimal feature subset. Alternatively, a globally optimal solution can be pursued with a genetic algorithm (GA), but the final solution usually selects about half of the original features. In this paper, we propose a new GA variant that controls the number of selected features (CGA for short). CGA is then integrated with HSIC to formulate a novel multi-label feature selection technique (CGAHSIC) for a given feature subset size. The effectiveness of the proposed CGAHSIC is validated through comparison with four existing algorithms on four benchmark data sets, according to four indicative multi-label classification evaluation metrics (Hamming loss, accuracy, F1, and subset accuracy).
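To make the two ingredients concrete, the following is a minimal Python sketch: the unbiased HSIC estimator in the form given by Song et al. (JMLR, 2012), and a GA whose repair step pins every chromosome to exactly d selected features. The kernel choice (RBF for both features and labels), the operators (tournament selection, uniform crossover, bit-flip mutation), and all names and parameters (repair, cga_hsic, pop, gens, pm) are illustrative assumptions, not the paper's exact implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def rbf_kernel(X, gamma=1.0):
    """Gaussian (RBF) kernel matrix over the rows of X."""
    sq = (X ** 2).sum(axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * np.maximum(d2, 0.0))

def unbiased_hsic(K, L):
    """Unbiased HSIC estimator HSIC_1 (Song et al., JMLR 2012); needs m >= 4."""
    m = K.shape[0]
    Kt = K - np.diag(np.diag(K))  # zero out the kernel diagonals
    Lt = L - np.diag(np.diag(L))
    t1 = np.trace(Kt @ Lt)
    t2 = Kt.sum() * Lt.sum() / ((m - 1) * (m - 2))
    t3 = 2.0 * (Kt @ Lt).sum() / (m - 2)
    return (t1 + t2 - t3) / (m * (m - 3))

def repair(mask, d):
    """Force a chromosome to select exactly d features -- the 'controlled' step."""
    on, off = np.flatnonzero(mask), np.flatnonzero(~mask)
    if on.size > d:
        mask[rng.choice(on, on.size - d, replace=False)] = False
    elif on.size < d:
        mask[rng.choice(off, d - on.size, replace=False)] = True
    return mask

def cga_hsic(X, Y, d, pop=30, gens=50, pm=0.05):
    """GA over fixed-size feature masks, scored by unbiased HSIC against labels."""
    n = X.shape[1]
    L = rbf_kernel(Y)  # label kernel, computed once

    def fitness(mask):
        return unbiased_hsic(rbf_kernel(X[:, mask]), L)

    P = [repair(rng.random(n) < 0.5, d) for _ in range(pop)]
    F = np.array([fitness(m) for m in P])
    for _ in range(gens):
        def tournament():
            i, j = rng.integers(pop, size=2)
            return P[i] if F[i] >= F[j] else P[j]

        kids = [P[int(F.argmax())].copy()]  # elitism: carry over the best mask
        while len(kids) < pop:
            a, b = tournament(), tournament()
            child = np.where(rng.random(n) < 0.5, a, b)  # uniform crossover
            child ^= rng.random(n) < pm                  # bit-flip mutation
            kids.append(repair(child, d))                # re-enforce |mask| == d
        P = kids
        F = np.array([fitness(m) for m in P])
    return np.flatnonzero(P[int(F.argmax())])
```

Under these assumptions, a call such as cga_hsic(X, Y, d=20) returns the indices of a 20-feature subset; the repair operator is what distinguishes this controlled GA from a plain GA, which tends to drift toward masks selecting roughly half of the features.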

Keywords

Multi-label learning · Feature selection · Hilbert-Schmidt independence criterion · Sequential forward selection · Genetic algorithm

Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  1. School of Computer Science and Technology, Nanjing Normal University, Nanjing, China
