A dividing-based many-objective evolutionary algorithm for large-scale feature selection

  • Haoran Li
  • Fazhi He (Email author)
  • Yaqian Liang
  • Quan Quan
Methodologies and Application

Abstract

Feature selection is a critical preprocessing step for constructing models in computer vision and machine learning, yet it is difficult to reduce the number of features while maintaining classification accuracy. To address this problem, we propose a dividing-based many-objective evolutionary algorithm for large-scale feature selection (DMEA-FS). First, four novel objectives are established for exploring optimal feature subsets, and DMEA-FS is designed with two structures: a wrapper for high accuracy and a filter for low computational cost. Second, two new recombination methods are presented for rapid convergence, and a mapping-based variable dividing method is presented for precisely dividing related variables. Third, based on the minimum Manhattan distance, a triangle-approximating decision-making method is proposed to assist users in choosing a solution with or without preference information. Numerical experiments against several state-of-the-art feature selection algorithms demonstrate that the proposed DMEA-FS outperforms its competitors in terms of both classification accuracy and metrics on the number of features.
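The decision-making step mentioned in the abstract builds on the minimum Manhattan distance. The sketch below illustrates only that generic idea, not the triangle-approximating method of DMEA-FS, whose details the abstract does not give. It assumes that every objective is minimized, that objectives are min-max normalized so the ideal point is the origin, and that user preference, when available, is expressed as per-objective weights. The function names, the weighting scheme and the sample values are hypothetical.

```python
from typing import List, Optional, Sequence


def normalize(front: Sequence[Sequence[float]]) -> List[List[float]]:
    """Min-max scale each objective to [0, 1]; all objectives are minimized."""
    n_obj = len(front[0])
    lows = [min(p[j] for p in front) for j in range(n_obj)]
    highs = [max(p[j] for p in front) for j in range(n_obj)]
    scaled = []
    for p in front:
        row = []
        for j in range(n_obj):
            span = highs[j] - lows[j]
            # A constant objective carries no information, so map it to 0.
            row.append((p[j] - lows[j]) / span if span > 0 else 0.0)
        scaled.append(row)
    return scaled


def pick_by_min_manhattan(
    front: Sequence[Sequence[float]],
    weights: Optional[Sequence[float]] = None,
) -> int:
    """Return the index of the solution closest to the ideal point (origin)
    in weighted Manhattan distance. Weights are an assumed way to encode
    preference; with no weights, all objectives count equally."""
    scaled = normalize(front)
    n_obj = len(front[0])
    w = list(weights) if weights is not None else [1.0] * n_obj
    distances = [sum(wj * v for wj, v in zip(w, row)) for row in scaled]
    return min(range(len(front)), key=distances.__getitem__)


# Hypothetical trade-off solutions: (classification error, number of features).
front = [(0.05, 40), (0.08, 12), (0.20, 3)]
no_preference = pick_by_min_manhattan(front)
accuracy_only = pick_by_min_manhattan(front, weights=[1.0, 0.0])
```

Without preference information the middle solution is picked, because it is the only one that is reasonably good on both objectives after normalization. Putting all weight on the error objective picks the most accurate subset instead.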

Keywords

Feature selection, MaOEAs, Decision-making

Notes

Acknowledgements

This study was funded by the National Science Foundation of China (Grant No. 61472289).

Compliance with ethical standards

Conflict of interest

Haoran Li declares that he has no conflict of interest. Fazhi He declares that he has no conflict of interest. Yaqian Liang declares that she has no conflict of interest. Quan Quan declares that he has no conflict of interest.

Ethical approval

This article does not contain any studies with human participants performed by any of the authors.

Copyright information

© Springer-Verlag GmbH Germany, part of Springer Nature 2019

Authors and Affiliations

  1. School of Computer Science, Wuhan University, Wuhan, China
