A hybrid Harris Hawks optimization algorithm with simulated annealing for feature selection

Abstract

The rapid growth of modern technology and smart systems has led to the massive production of big data. Big data suffers not only from high dimensionality but also from redundant, irrelevant, and noisy features. Feature selection (FS), the search for an optimal subset of features, has therefore become an urgent need. This paper presents a hybrid version of the Harris Hawks Optimization algorithm based on Bitwise operations and Simulated Annealing (HHOBSA) to solve the FS problem for classification using a wrapper method. Two bitwise operations (AND and OR) randomly transfer the most informative features from the best solution to the other solutions in the population to raise their quality. Simulated Annealing (SA) boosts the performance of HHOBSA and helps it escape local optima. A standard wrapper evaluator, the K-nearest neighbors classifier with the Euclidean distance metric, scores the new solutions. HHOBSA is compared with other state-of-the-art algorithms on 24 standard datasets and 19 artificial datasets, with dimensionality reaching into the thousands. The artificial datasets help to study the effects of data dimensionality, noise ratio, and sample size on the FS process. We employ several performance measures, including classification accuracy, fitness value, number of selected features, and computational time. We also conduct two statistical significance tests, the paired-samples t-test and the Wilcoxon signed-rank test. The proposed algorithm achieved superior results compared with the other algorithms.
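The two mechanisms named in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact update rules, so everything below is an assumption made for illustration: solutions are binary lists with one bit per feature, a random mask decides which of the best solution's features are transferred, and SA uses the classic Metropolis acceptance rule. The function names `transfer_features` and `sa_accept` are hypothetical and do not come from the paper.

```python
import math
import random


def transfer_features(best, other, rng):
    """Copy a random subset of the best solution's selected features into another solution.

    Assumption of this sketch: the AND step filters the best solution through a
    random mask, and the OR step merges the result into the other solution.
    """
    # Random binary mask, one bit per feature.
    mask = [rng.randint(0, 1) for _ in best]
    # AND keeps only those features of `best` that the mask picks.
    picked = [b & m for b, m in zip(best, mask)]
    # OR switches the picked features on in `other` without switching any off.
    return [o | p for o, p in zip(other, picked)]


def sa_accept(current_cost, new_cost, temperature, rng):
    """Metropolis rule: always accept improvements, sometimes accept worse moves.

    `temperature` must be positive; lower values make worse moves less likely.
    """
    if new_cost <= current_cost:
        return True
    return rng.random() < math.exp(-(new_cost - current_cost) / temperature)


rng = random.Random(0)
best = [1, 0, 1, 1, 0]    # hypothetical best solution (1 = feature selected)
other = [0, 1, 0, 0, 0]   # hypothetical weaker solution
child = transfer_features(best, other, rng)
print(child)
```

In this sketch a feature already selected in `other` is never dropped, and a feature can only be newly switched on if the best solution selects it. Accepting an occasional worse solution at high temperature is what lets a search of this kind leave a local optimum.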



Funding

This research has no funding source.

Author information

Corresponding author

Correspondence to Mohamed Abdel-Basset.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Abdel-Basset, M., Ding, W. & El-Shahat, D. A hybrid Harris Hawks optimization algorithm with simulated annealing for feature selection. Artif Intell Rev 54, 593–637 (2021). https://doi.org/10.1007/s10462-020-09860-3

Keywords

  • Feature selection
  • Harris Hawks algorithm
  • k-nearest neighbor
  • Classification
  • Data dimensionality