Feature Selection Techniques for Email Spam Classification: A Survey

  • V. Sri Vinitha
  • D. Karthika Renuka
Conference paper

Abstract

In today's digital world, much communication takes place over the Internet. Email is widely used to exchange information, not only in personal communication but also in business communication, because it is effective, fast, and inexpensive. Spam email is a serious problem on the Internet: when a user clicks on a spam message, it can spread viruses on the user's system, consume a lot of network bandwidth and email storage space, and steal the user's confidential data. Feature selection picks the best features from a dataset and removes irrelevant, redundant, and noisy data. This paper surveys email spam detection methods that incorporate feature selection approaches such as Information Gain, Correlation-Based Feature Selection, Genetic Algorithm, Ant Colony Optimization, Artificial Bee Colony, Particle Swarm Optimization, Cuckoo Search Algorithm, and Harmony Search Algorithm. Classification performed after feature selection enhances the performance of spam filtering.
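As a rough illustration of the filter-style selection named above, the following sketch ranks binary word-presence features by Information Gain and keeps the top k. It is not taken from the paper: the function names, the toy emails, and the choice of k are all assumptions made for this example, and a real spam filter would pass the selected features to a classifier afterwards.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(feature_values, labels):
    """Reduction in label entropy obtained by splitting on one discrete feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def select_top_k(rows, labels, feature_names, k):
    """Score every feature by information gain and return the k best names."""
    scores = {
        name: information_gain([row[i] for row in rows], labels)
        for i, name in enumerate(feature_names)
    }
    ranked = sorted(feature_names, key=lambda name: scores[name], reverse=True)
    return ranked[:k], scores


# Toy data invented for illustration: 1 means the word occurs in the email.
FEATURES = ["free", "meeting", "the"]
ROWS = [
    [1, 0, 1],
    [1, 0, 1],
    [1, 0, 0],
    [0, 1, 1],
    [0, 1, 1],
    [0, 0, 1],
]
LABELS = ["spam", "spam", "spam", "ham", "ham", "ham"]

selected, scores = select_top_k(ROWS, LABELS, FEATURES, k=2)
print(selected)
```

On this toy data the word "free" separates the two classes perfectly and scores highest, while the near-constant word "the" scores lowest and is dropped. The metaheuristic approaches listed in the abstract (Genetic Algorithm, Particle Swarm Optimization, and so on) instead search over feature subsets, typically scoring each subset by classifier performance.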

Keywords

Feature selection, Information gain, Genetic algorithm, Artificial bee colony, Ant colony optimization, Particle swarm optimization, Cuckoo search algorithm, Harmony search algorithm

Abbreviations

ABC: Artificial Bee Colony Optimization
BoW: Bag-of-Words
CGA: Compact Genetic Algorithm
EA: Evolutionary Algorithm
F-GSO: Firefly-Group Search Optimizer
HKSVM: Hybrid Kernel based Support Vector Machine
HSA: Harmony Search Algorithm
KNN: K-Nearest Neighbors
LR: Logistic Regression
MLP: Multi-Layer Perceptron
MLP-NN: Multi-Layer Perceptron Neural Network
NB: Naïve Bayes
PCA: Principal Component Analysis
PNN: Probabilistic Neural Network
PoS: Part-of-Speech
PSO: Particle Swarm Optimization
PU: Positive Unlabeled
RF: Random Forests
SCS: Stepsize Cuckoo Search
SMO: Sequential Minimal Optimization
SVM: Support Vector Machine
TF-IDF: Term Frequency – Inverse Document Frequency
TREC: Text Retrieval Conference
UCI: UC Irvine

Notes

Acknowledgments

Our sincere thanks to the University Grants Commission (UGC), Hyderabad, for granting the funds to carry out this work.

Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  • V. Sri Vinitha (1)
  • D. Karthika Renuka (2)
  1. Bannari Amman Institute of Technology, Sathyamangalam, Erode, India
  2. PSG College of Technology, Coimbatore, India
