Abstract
E-mail has become a fast and economical way to exchange information. However, unsolicited or junk e-mail, also known as spam, has quickly become a major problem on the Internet, and protecting users from it is now an important research area. Spam filtering is used to keep undesirable e-mails from reaching users. In this paper we propose a spam detection system called “3CA&1NB”, which uses machine learning to detect spam. “3CA&1NB” combines three cellular automata and one naïve Bayes algorithm. We discuss how combining learning-based methods can improve detection performance. Our preliminary results show that the combined system can detect spam effectively.
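The abstract does not state how the four classifiers' decisions are merged, so the following is only a minimal sketch of one common combination scheme, majority voting, and not the authors' method. The names `majority_vote`, `keyword_rule`, and `ensemble` are hypothetical, and the keyword rules are toy stand-ins for the three cellular automata and the naïve Bayes classifier.

```python
from collections import Counter
from typing import Callable, Sequence

# A classifier maps an e-mail body to the label "spam" or "ham".
Classifier = Callable[[str], str]


def majority_vote(classifiers: Sequence[Classifier], message: str,
                  tie_label: str = "ham") -> str:
    """Return the label chosen by most classifiers.

    With an even number of voters (four here) a tie is possible; this
    sketch resolves it with `tie_label`, which is an assumption.
    """
    votes = Counter(clf(message) for clf in classifiers)
    spam, ham = votes["spam"], votes["ham"]
    if spam == ham:
        return tie_label
    return "spam" if spam > ham else "ham"


def keyword_rule(keyword: str) -> Classifier:
    """Toy stand-in classifier: flags a message containing `keyword`."""
    return lambda message: "spam" if keyword in message.lower() else "ham"


# Hypothetical ensemble of four base classifiers (3 + 1 in the paper's naming).
ensemble = [keyword_rule("free"), keyword_rule("winner"),
            keyword_rule("click"), keyword_rule("prize")]

print(majority_vote(ensemble, "You are a WINNER, click to claim your FREE prize"))
print(majority_vote(ensemble, "Meeting moved to 3 pm"))
```

Defaulting ties to "ham" reflects the usual preference in spam filtering for letting a spam message through over discarding a legitimate one; a weighted vote would be another reasonable choice.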
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Barigou, F., Barigou, N., Atmani, B. (2012). Combining Classifiers for Spam Detection. In: Benlamri, R. (eds) Networked Digital Technologies. NDT 2012. Communications in Computer and Information Science, vol 293. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-30507-8_8
DOI: https://doi.org/10.1007/978-3-642-30507-8_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-30506-1
Online ISBN: 978-3-642-30507-8
eBook Packages: Computer Science, Computer Science (R0)