Abstract
Knowledge discovery in databases is a comprehensive procedure that enables researchers to extract useful knowledge and information from raw sample data. Problems can arise during this procedure, for example the curse of dimensionality, in which case reducing the database is desirable to remove redundant or irrelevant features. In this paper, we propose a wrapper-based feature selection algorithm that combines an artificial neural network with a self-adaptive differential evolution optimization algorithm. We test the algorithm on a bank marketing case study and show that it reduces the size of the database while simultaneously improving prediction performance on the observed problem.
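The wrapper idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and several parts are assumptions made for the example: a nearest-centroid classifier stands in for the artificial neural network, a feature counts as selected when its gene exceeds a threshold of 0.5, the data are synthetic, and all function names are hypothetical. The self-adaptation of F and CR follows the jDE scheme of Brest et al. (2006), with both adaptation probabilities set to 0.1.

```python
import random

def make_data(n, n_informative=2, n_noise=6, seed=0):
    """Synthetic binary data (assumption): informative features shift with the label."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        row = [rng.gauss(2.0 * label, 1.0) for _ in range(n_informative)]
        row += [rng.gauss(0.0, 1.0) for _ in range(n_noise)]
        X.append(row)
        y.append(label)
    return X, y

def accuracy(mask, train, valid):
    """Wrapper fitness: validation accuracy of a classifier fitted on the selected columns.
    A nearest-centroid classifier is used here as a stand-in for the ANN."""
    cols = [j for j, keep in enumerate(mask) if keep]
    if not cols:
        return 0.0  # an empty feature subset cannot be evaluated
    (Xt, yt), (Xv, yv) = train, valid
    centroids = {}
    for c in (0, 1):
        rows = [x for x, lab in zip(Xt, yt) if lab == c]
        centroids[c] = [sum(r[j] for r in rows) / len(rows) for j in cols]
    hits = 0
    for x, lab in zip(Xv, yv):
        dist = {c: sum((x[j] - m) ** 2 for j, m in zip(cols, centroids[c]))
                for c in (0, 1)}
        hits += (min(dist, key=dist.get) == lab)
    return hits / len(Xv)

def decode(vec):
    """Threshold mapping (assumption): gene > 0.5 means the feature is kept."""
    return [v > 0.5 for v in vec]

def jde_select(train, valid, dim, pop_size=20, generations=30, seed=1):
    """DE/rand/1/bin with jDE self-adaptation of F and CR per individual."""
    rng = random.Random(seed)
    pop = [[rng.random() for _ in range(dim)] for _ in range(pop_size)]
    F = [0.5] * pop_size
    CR = [0.9] * pop_size
    fit = [accuracy(decode(p), train, valid) for p in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # jDE: resample the control parameters with probability 0.1 each
            f = 0.1 + 0.9 * rng.random() if rng.random() < 0.1 else F[i]
            cr = rng.random() if rng.random() < 0.1 else CR[i]
            a, b, c = rng.sample([k for k in range(pop_size) if k != i], 3)
            j_rand = rng.randrange(dim)  # guarantees at least one mutated gene
            trial = []
            for j in range(dim):
                if rng.random() < cr or j == j_rand:
                    v = pop[a][j] + f * (pop[b][j] - pop[c][j])
                    trial.append(min(1.0, max(0.0, v)))  # clip to [0, 1]
                else:
                    trial.append(pop[i][j])
            t_fit = accuracy(decode(trial), train, valid)
            # Greedy selection; the population is updated in place for brevity
            if t_fit >= fit[i]:
                pop[i], fit[i], F[i], CR[i] = trial, t_fit, f, cr
    best = max(range(pop_size), key=fit.__getitem__)
    return decode(pop[best]), fit[best]

X, y = make_data(300)
train, valid = (X[:200], y[:200]), (X[200:], y[200:])
mask, acc = jde_select(train, valid, dim=len(X[0]))
print("selected features:", [j for j, keep in enumerate(mask) if keep])
print("validation accuracy:", acc)
```

The fitness here is plain validation accuracy. A variant that also rewards smaller subsets would add a penalty on the number of selected features, which is a design choice this sketch leaves out.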
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Fister, D., Fister, I., Jagrič, T., Fister Jr., I., Brest, J. (2020). Wrapper-Based Feature Selection Using Self-adaptive Differential Evolution. In: Zamuda, A., Das, S., Suganthan, P., Panigrahi, B. (eds) Swarm, Evolutionary, and Memetic Computing and Fuzzy and Neural Computing. SEMCCO 2019, FANCCO 2019. Communications in Computer and Information Science, vol 1092. Springer, Cham. https://doi.org/10.1007/978-3-030-37838-7_13
DOI: https://doi.org/10.1007/978-3-030-37838-7_13
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-37837-0
Online ISBN: 978-3-030-37838-7
eBook Packages: Computer Science, Computer Science (R0)