Abstract
We present an inclusive review of recent and successful content-based e-mail spam filtering techniques. Our focus is mainly on machine learning-based spam filters and the variants inspired by them. We report on relevant ideas, techniques, taxonomy, major efforts, and the state of the art in the field. Our initial review of prior work examines the basics of e-mail spam filtering and feature engineering. We conclude by studying techniques and evaluation benchmarks, exploring promising offshoots of the latest developments, and suggesting lines of future investigation.
© 2018 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Bhowmick, A., Hazarika, S.M. (2018). E-Mail Spam Filtering: A Review of Techniques and Trends. In: Kalam, A., Das, S., Sharma, K. (eds) Advances in Electronics, Communication and Computing. Lecture Notes in Electrical Engineering, vol 443. Springer, Singapore. https://doi.org/10.1007/978-981-10-4765-7_61
DOI: https://doi.org/10.1007/978-981-10-4765-7_61
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-4764-0
Online ISBN: 978-981-10-4765-7
eBook Packages: Engineering, Engineering (R0)