In high-dimensional data, many features are either irrelevant to the machine learning task or redundant. This leads to two problems: overfitting and high computational overhead. The paper proposes a wrapper-based feature selection method to identify a relevant subset of features for the machine learning task. The wrapper uses the Binary Bat Algorithm to select candidate feature subsets and the One-pass Generalized Classifier Neural Network (OGCNN) to evaluate them with a novel fitness function. The proposed fitness function accounts for the entropy of sensitivity and specificity, along with the accuracy of the classifier and the fraction of selected features. The fitness function is compared using four classifiers (Radial Basis Function Neural Network, Probabilistic Neural Network, Extreme Learning Machine and OGCNN) on six publicly available datasets. One-pass classifiers are chosen because they are computationally faster. The results suggest that OGCNN combined with the novel fitness function performs well in the majority of cases.
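The abstract names the ingredients of the fitness function but not its exact form. The Python sketch below shows one plausible way to combine them: classifier accuracy, an entropy term over sensitivity and specificity, and a penalty on the fraction of selected features. The weights `w_acc`, `w_ent`, `w_feat` and the precise shape of the entropy term are illustrative assumptions, not the paper's formulation.

```python
import math

def binary_entropy(p):
    """Shannon entropy (in bits) of a Bernoulli(p) variable."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)

def fitness(accuracy, sensitivity, specificity, n_selected, n_total,
            w_acc=0.8, w_ent=0.1, w_feat=0.1):
    """Hypothetical wrapper fitness (to be maximised by the search).

    Rewards high accuracy, rewards low entropy of sensitivity and
    specificity (penalising classifiers that are uninformative on
    either class), and penalises large feature subsets. Weights are
    illustrative, not the paper's values.
    """
    ent = 0.5 * (binary_entropy(sensitivity) + binary_entropy(specificity))
    feat_fraction = n_selected / n_total
    return w_acc * accuracy + w_ent * (1.0 - ent) - w_feat * feat_fraction
```

In a wrapper loop, each bat's binary position vector marks which features are kept; the classifier is trained on that subset and this score drives the search toward small, accurate subsets.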
Conflict of interest
The authors declare that they have no conflict of interest.
This article does not contain any studies with human participants or animals performed by any of the authors.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Communicated by V. Loia.
Cite this article
Naik, A.K., Kuppili, V. & Edla, D.R. Efficient feature selection using one-pass generalized classifier neural network and binary bat algorithm with a novel fitness function. Soft Comput 24, 4575–4587 (2020). https://doi.org/10.1007/s00500-019-04218-6
Keywords
- Feature selection
- Wrapper approach
- Bio-inspired algorithms
- One-pass neural network