Abstract
The rapid growth of modern technology and smart systems has led to a massive production of big data. Beyond its high dimensionality, big data suffers from other emerging problems such as redundant, irrelevant, and noisy features. Feature selection (FS) has therefore become essential for finding the optimal subset of features. This paper presents a hybrid version of the Harris Hawks Optimization algorithm based on Bitwise operations and Simulated Annealing (HHOBSA) to solve the FS problem for classification purposes using wrapper methods. Two bitwise operations (AND and OR) can randomly transfer the most informative features from the best solution to the other solutions in the population to raise their quality. Simulated Annealing (SA) boosts the performance of the HHOBSA algorithm and helps it escape local optima. A standard wrapper method, K-nearest neighbors with the Euclidean distance metric, serves as an evaluator for new solutions. We compare HHOBSA with other state-of-the-art algorithms on 24 standard datasets and 19 artificial datasets whose dimensions can reach thousands of features. The artificial datasets help to study the effects of data dimensionality, noise ratio, and sample size on the FS process. We employ several performance measures, including classification accuracy, fitness values, the number of selected features, and computational time. We also perform two statistical significance tests, the paired-samples t test and the Wilcoxon signed-rank test. The proposed algorithm achieves superior results compared with the other algorithms.
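As a rough illustration of the two bitwise operations and the SA acceptance rule mentioned above, the following is a minimal sketch; the function names, the 50/50 operator choice, and the toy feature masks are assumptions for illustration, not the paper's exact scheme:

```python
import math
import random

def bitwise_transfer(solution, best, rand=random.random):
    """Sketch of the bitwise transfer on binary feature masks:
    OR can copy features that the best solution selects, while AND
    keeps only features shared with it. (The 50/50 operator choice
    here is an assumption.)"""
    if rand() < 0.5:
        return [s | b for s, b in zip(solution, best)]  # OR: inherit best's features
    return [s & b for s, b in zip(solution, best)]      # AND: intersect with best

def sa_accept(delta, temperature, rand=random.random):
    """Classic SA rule: always accept an improvement (delta < 0),
    and accept a worse solution with probability exp(-delta / T)."""
    return delta < 0 or rand() < math.exp(-delta / temperature)

# Toy binary feature masks (1 = feature selected)
best = [1, 0, 1, 1, 0]
sol  = [0, 1, 1, 0, 0]
or_child  = bitwise_transfer(sol, best, rand=lambda: 0.1)  # forces the OR branch
and_child = bitwise_transfer(sol, best, rand=lambda: 0.9)  # forces the AND branch
```

Here `or_child` becomes `[1, 1, 1, 1, 0]` and `and_child` becomes `[0, 0, 1, 0, 0]`; in the full algorithm each candidate mask would then be scored by the KNN wrapper before SA decides whether to keep it.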
Funding
This research has no funding source.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Ethical approval
This article does not contain any studies with human participants or animals performed by any of the authors.
Cite this article
Abdel-Basset, M., Ding, W. & El-Shahat, D. A hybrid Harris Hawks optimization algorithm with simulated annealing for feature selection. Artif Intell Rev 54, 593–637 (2021). https://doi.org/10.1007/s10462-020-09860-3