An evolutionary computation-based approach for feature selection

Abstract

Feature selection plays an important role in classification: by reducing the dimensionality of a dataset, it can decrease computational time and improve the accuracy and efficiency of a machine learning task. Feature selection is the process of selecting a subset of features according to an optimization criterion. Traditional statistical methods have become ineffective for two reasons: the growing number of observations and the growing number of features associated with each observation. Feature selection techniques reduce computational time, support a better understanding of the data, and improve the performance of machine learning and pattern recognition algorithms. The feature selection problem can be stated as finding a minimal subset of features that carries sufficient information for the problem at hand while increasing the accuracy of the classification algorithm. Several techniques have been proposed to remove irrelevant and redundant features. In this paper, a novel feature selection algorithm that combines genetic algorithms (GA) and particle swarm optimization (PSO) is proposed for faster and better search capability. The hybrid algorithm exploits the advantages of both methods, and the gain ratio index is used to rank the features. To evaluate the performance of the approach, experiments were performed on seven real-world datasets, and the efficiency of the developed hybrid algorithm was compared with that of the basic algorithms. The results demonstrate that the proposed approach achieves superior classification accuracy compared with the other methods.
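The abstract states that the gain ratio index is used to rank features. As a minimal sketch (not the authors' implementation), the gain ratio of a discrete feature can be computed as its information gain divided by its split information:

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a sequence of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def gain_ratio(feature, labels):
    """Gain ratio of a discrete feature with respect to class labels:
    information gain divided by the feature's split information."""
    n = len(labels)
    # Partition the labels by feature value
    groups = {}
    for f, y in zip(feature, labels):
        groups.setdefault(f, []).append(y)
    # Conditional entropy of the labels after splitting on the feature
    cond = sum(len(g) / n * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - cond
    # Split information penalizes features with many distinct values
    split_info = -sum(len(g) / n * math.log2(len(g) / n) for g in groups.values())
    return info_gain / split_info if split_info > 0 else 0.0
```

A feature that perfectly separates the classes scores 1.0, while a feature independent of the classes scores 0.0; ranking features by this score is the filter step the abstract describes.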


Figs. 1–6
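The abstract describes a hybrid of GA and PSO but does not give the exact operator schedule. The following is one common illustrative pattern for such hybrids (a hypothetical `hybrid_ga_pso`, not the paper's method): binary PSO with a sigmoid transfer function, augmented with GA-style uniform crossover toward the global best and bit-flip mutation, where each bit marks a feature as selected or dropped.

```python
import math
import random

def hybrid_ga_pso(n_features, fitness, swarm_size=20, iters=50,
                  w=0.7, c1=1.5, c2=1.5, mut_rate=0.02, seed=0):
    """Illustrative GA+PSO hybrid over binary feature masks.
    Binary PSO update (sigmoid transfer) followed by GA-style
    uniform crossover with the global best and bit-flip mutation.
    This is a generic sketch, not the paper's exact algorithm."""
    rng = random.Random(seed)
    sigmoid = lambda v: 1.0 / (1.0 + math.exp(-v))
    pos = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(swarm_size)]
    vel = [[0.0] * n_features for _ in range(swarm_size)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    g = max(range(swarm_size), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    for _ in range(iters):
        for i in range(swarm_size):
            for d in range(n_features):
                # Binary PSO velocity and position update
                vel[i][d] = (w * vel[i][d]
                             + c1 * rng.random() * (pbest[i][d] - pos[i][d])
                             + c2 * rng.random() * (gbest[d] - pos[i][d]))
                pos[i][d] = 1 if rng.random() < sigmoid(vel[i][d]) else 0
                # GA operators: uniform crossover with gbest, then mutation
                if rng.random() < 0.5:
                    pos[i][d] = gbest[d]
                if rng.random() < mut_rate:
                    pos[i][d] ^= 1
            f = fitness(pos[i])
            if f > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], f
                if f > gbest_fit:
                    gbest, gbest_fit = pos[i][:], f
    return gbest, gbest_fit
```

In a wrapper setting, `fitness` would typically be cross-validated classifier accuracy (optionally penalized by subset size); in a filter setting, it could aggregate gain ratio scores of the selected features.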


Author information

Corresponding author

Correspondence to Abdorrahman Haeri.



About this article

Cite this article

Moslehi, F., Haeri, A. An evolutionary computation-based approach for feature selection. J Ambient Intell Human Comput 11, 3757–3769 (2020). https://doi.org/10.1007/s12652-019-01570-1

Keywords

  • Feature selection
  • Evolutionary approach
  • Genetic algorithm
  • Particle swarm optimization (PSO)
  • Gain ratio index