Abstract
In this digital world, most communication takes place over the Internet. Email is widely used not only for personal communication but also for business communication, because it is effective, fast, and inexpensive. Spam email is a serious problem on the Internet: when users click on a spam message, it can spread viruses on the user's system, consume a large amount of network bandwidth and email storage space, and steal the user's confidential data. Feature selection picks the best features from a dataset by removing irrelevant, redundant, and noisy data. This paper surveys email spam detection that incorporates various feature selection approaches such as Information Gain, Correlation-Based Feature Selection, Genetic Algorithm, Ant Colony Optimization, Artificial Bee Colony, Particle Swarm Optimization, Cuckoo Search Algorithm, and Harmony Search Algorithm; performing classification after feature selection enhances the performance of spam filtering.
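As a rough illustration of what filter-style feature selection does before classification, the following minimal sketch ranks terms by Information Gain on a toy corpus and keeps the top-scoring ones. The corpus, the function names (`entropy`, `information_gain`, `select_top_k`), and the choice of binary term-presence features are assumptions made for this example; they are not taken from the paper or from any of the surveyed systems.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())

def information_gain(docs, labels, term):
    """Reduction in label entropy from knowing whether `term` occurs."""
    present = [y for d, y in zip(docs, labels) if term in d]
    absent = [y for d, y in zip(docs, labels) if term not in d]
    n = len(labels)
    conditional = ((len(present) / n) * entropy(present)
                   + (len(absent) / n) * entropy(absent))
    return entropy(labels) - conditional

def select_top_k(docs, labels, k):
    """Return the k terms with the highest Information Gain."""
    vocabulary = sorted(set().union(*docs))
    ranked = sorted(vocabulary,
                    key=lambda t: information_gain(docs, labels, t),
                    reverse=True)
    return ranked[:k]

# Hypothetical toy corpus (illustrative only, not real data).
emails = [
    ("win free prize now", "spam"),
    ("free offer click now", "spam"),
    ("meeting agenda for monday", "ham"),
    ("project report for review", "ham"),
]
docs = [set(text.split()) for text, _ in emails]  # binary term presence
labels = [label for _, label in emails]

top_terms = select_top_k(docs, labels, 3)
# Project each message onto the selected terms; a classifier such as
# Naive Bayes or an SVM would then be trained on this reduced representation.
reduced_docs = [d & set(top_terms) for d in docs]
```

In this toy corpus the terms that appear in every message of one class and in none of the other receive the highest score, so they are the ones retained. The metaheuristic approaches listed in the abstract search over feature subsets rather than scoring terms one at a time, and this sketch does not cover them.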
Abbreviations
- ABC: Artificial Bee Colony Optimization
- BoW: Bag-of-Words
- CGA: Compact Genetic Algorithm
- EA: Evolutionary Algorithm
- F-GSO: Firefly-Group Search Optimizer
- HKSVM: Hybrid Kernel based Support Vector Machine
- HSA: Harmony Search Algorithm
- KNN: K-Nearest Neighbors
- LR: Logistic Regression
- MLP: Multi-Layer Perceptron
- MLP-NN: Multi-Layer Perceptron Neural Network
- NB: Naïve Bayes
- PCA: Principal Component Analysis
- PNN: Probabilistic Neural Network
- PoS: Part-of-Speech
- PSO: Particle Swarm Optimization
- PU: Positive Unlabeled
- RF: Random Forests
- SCS: Stepsize Cuckoo Search
- SMO: Sequential Minimal Optimization
- SVM: Support Vector Machine
- TF-IDF: Term Frequency – Inverse Document Frequency
- TREC: Text Retrieval Conference
- UCI: UC Irvine
Acknowledgments
Our sincere thanks to the University Grants Commission (UGC), Hyderabad, for granting the funds to carry out this work.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Vinitha, V.S., Renuka, D.K. (2020). Feature Selection Techniques for Email Spam Classification: A Survey. In: Kumar, L., Jayashree, L., Manimegalai, R. (eds) Proceedings of International Conference on Artificial Intelligence, Smart Grid and Smart City Applications. AISGSC 2019. Springer, Cham. https://doi.org/10.1007/978-3-030-24051-6_86
DOI: https://doi.org/10.1007/978-3-030-24051-6_86
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-24050-9
Online ISBN: 978-3-030-24051-6
eBook Packages: Engineering, Engineering (R0)