Abstract
Feature selection reduces the number of features in applications where the data has hundreds of attributes or more. Identifying the essential or relevant attributes has become a vital task for applying data mining algorithms efficiently to real-world problems. Current feature selection techniques concentrate primarily on finding relevant attributes. This paper presents the notions of feature relevance, redundancy, and evaluation criteria, together with a literature survey of feature selection approaches applied by many researchers in different areas. The survey is intended to help readers choose a feature selection technique without needing detailed knowledge of every algorithm.
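To make the notions of relevance and redundancy concrete, the following minimal sketch scores discrete features with mutual information (MI) and greedily selects a subset. It is an illustration only, not a method taken from the surveyed papers: the scoring rule (relevance minus mean redundancy, in the spirit of mRMR-style filters), the function names `mutual_information` and `select_features`, and the toy data are all assumptions made for this example.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Mutual information (in bits) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * log2((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in pxy.items()
    )


def select_features(features, labels, k):
    """Greedy filter: pick up to k feature names.

    Illustrative score (an assumption, not the paper's criterion):
    MI(feature, labels) minus the mean MI between the feature and
    the features already selected.
    """
    selected = []
    remaining = set(features)
    while remaining and len(selected) < k:

        def score(name):
            relevance = mutual_information(features[name], labels)
            if not selected:
                return relevance
            redundancy = sum(
                mutual_information(features[name], features[s])
                for s in selected
            ) / len(selected)
            return relevance - redundancy

        # Sorting makes tie-breaking deterministic (alphabetical order).
        best = max(sorted(remaining), key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical toy data: "a_copy" duplicates "a" (fully redundant),
# "b" is weakly relevant, and "noise" is independent of the labels.
labels = [0, 0, 1, 1, 0, 0, 1, 1]
features = {
    "a": [0, 0, 1, 1, 0, 0, 1, 0],
    "a_copy": [0, 0, 1, 1, 0, 0, 1, 0],
    "b": [0, 0, 0, 1, 0, 1, 1, 1],
    "noise": [0, 1, 0, 1, 0, 1, 0, 1],
}

chosen = select_features(features, labels, k=2)
print(chosen)
```

On this toy data the duplicate feature is as relevant as the original, yet it is passed over in the second round because it adds no new information. A selector that ranks by relevance alone would keep both copies.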
Abbreviations
- AA: Average Accuracy
- AIC: Akaike information criterion
- ANN: Artificial Neural Network
- AUC: Area under the Curve
- BWO: Binary Wolf Optimization
- CART: Classification and Regression Tree
- CFA: Cuttlefish algorithm
- CFS: Correlation-based Feature Selection
- CS: Chi-Square
- DM: Data Mining
- F: F-Score
- FCBF: Fast Correlation-based Feature selection
- FP: False Positive
- GA: Genetic Algorithm
- GR: Gain Ratio
- IG: Information Gain
- K-NN: K-Nearest Neighbor
- LMT: Logistic Model Tree
- MI: Mutual Information
- MLP: Multi-Layer Perceptron
- NB: Naïve Bayes
- OA: Overall Accuracy
- P: Precision
- PCA: Principal Component Analysis
- PSO: Particle Swarm Optimization
- R: Recall
- RBF: Radial Basis Function
- ROC: Receiver Operating Curve
- SVM: Support Vector Machine
- TP: True Positive
- TV: Term Variance
- WOA: Whale Optimization algorithm
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Durairaj, M., Poornappriya, T.S. (2020). Why Feature Selection in Data Mining Is Prominent? A Survey. In: Kumar, L., Jayashree, L., Manimegalai, R. (eds) Proceedings of International Conference on Artificial Intelligence, Smart Grid and Smart City Applications. AISGSC 2019. Springer, Cham. https://doi.org/10.1007/978-3-030-24051-6_88
DOI: https://doi.org/10.1007/978-3-030-24051-6_88
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-24050-9
Online ISBN: 978-3-030-24051-6
eBook Packages: Engineering, Engineering (R0)