Architecture of Adaptive Spam Filtering Based on Machine Learning Algorithms
Spam is commonly defined as unsolicited email, and the goal of spam filtering is to distinguish spam from legitimate email messages. Much work has been done on filtering spam with machine learning (ML) algorithms, and substantial performance has been achieved, though at the cost of some false positives (FPs), that is, legitimate messages misclassified as spam. In spam detection, false positives are sometimes unacceptable. This paper proposes an adaptive spam filtering model based on ML algorithms that aims to improve accuracy by reducing false positives. The model combines individual and combined filtering approaches built from existing, well-known ML algorithms. It considers both the individual and the collective outputs of these filters and passes them to an analyzer. A dynamic feature selection (DFS) technique is also proposed to further improve accuracy.
Keywords: Machine learning, spam, SVM, NB, FP
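The abstract describes filters whose individual and collective outputs are passed to an analyzer in order to reduce false positives. The following is a minimal Python sketch of that idea, not the paper's implementation. The names `TinyNaiveBayes`, `keyword_rule`, `analyze`, and `classify`, the keyword list, the toy training data, and the unanimity rule with a `"review"` outcome are all assumptions made for illustration; the abstract does not specify how the analyzer decides.

```python
from collections import Counter
from math import log


class TinyNaiveBayes:
    """Multinomial Naive Bayes with Laplace smoothing over word tokens."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.doc_counts = Counter()

    def fit(self, messages, labels):
        for text, label in zip(messages, labels):
            self.doc_counts[label] += 1
            self.word_counts[label].update(text.lower().split())
        return self

    def predict(self, text):
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_docs = sum(self.doc_counts.values())
        scores = {}
        for label in ("spam", "ham"):
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(vocab)
            # Log prior plus smoothed log likelihood of each token.
            score = log(self.doc_counts[label] / total_docs)
            for word in text.lower().split():
                score += log((counts[word] + 1) / denom)
            scores[label] = score
        return "spam" if scores["spam"] > scores["ham"] else "ham"


def keyword_rule(text, keywords=("free", "winner", "prize")):
    """Hypothetical second filter: flags any message containing a keyword."""
    words = set(text.lower().split())
    return "spam" if words & set(keywords) else "ham"


def analyze(verdicts):
    """Assumed analyzer rule: label as spam only when every filter agrees.

    Disagreements are routed to "review" instead of being blocked, which
    trades some recall for fewer false positives.
    """
    if all(v == "spam" for v in verdicts):
        return "spam"
    if all(v == "ham" for v in verdicts):
        return "ham"
    return "review"


# Toy training data, invented for this sketch.
train = [
    ("win a free prize now", "spam"),
    ("free money winner claim prize", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch on friday with the team", "ham"),
]
nb = TinyNaiveBayes().fit(*zip(*train))


def classify(text):
    """Collect each filter's individual verdict and pass them to the analyzer."""
    return analyze([nb.predict(text), keyword_rule(text)])


if __name__ == "__main__":
    for message in ("claim your free prize",
                    "meeting agenda for monday",
                    "free lunch on friday"):
        print(message, "->", classify(message))
```

In this sketch a message such as "free lunch on friday" is flagged by the keyword filter but not by Naive Bayes, so the analyzer defers it to review instead of discarding it. A real system would replace the stand-in filters with trained classifiers such as SVM and NB and would need a principled policy for the disagreement case.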