Reducing Classification Times for Email Spam Using Incremental Multiple Instance Classifiers
Combating spam email is both costly and time-consuming. This paper presents a spam classification algorithm that combines majority voting with a multiple instance approach to determine the final classification. Because the classifier is composed of multiple sub-classifiers, it can be updated incrementally by replacing an individual sub-classifier. Furthermore, each sub-classifier represents only a small fraction of a typical classifier, so it can be trained in less time and with less data. The experiments were conducted on the TREC 2007 spam corpus.
Keywords: Multiple instance classifiers, email spam, Naïve Bayes, TREC 2007 spam corpus
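The idea described in the abstract, several small sub-classifiers that vote and can be swapped out one at a time, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not the paper's implementation: the class names `TinyNaiveBayes` and `VotingSpamFilter`, the whitespace tokenizer, the Laplace smoothing, the tie-breaking rule (ties go to "ham"), and the toy training batches. The multiple instance aspect of the paper is not modeled; only majority voting and sub-classifier replacement are shown.

```python
import math
from collections import Counter


class TinyNaiveBayes:
    """Small multinomial Naive Bayes over whitespace tokens (illustrative only)."""

    def __init__(self, messages, labels):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.doc_counts = Counter(labels)
        for text, label in zip(messages, labels):
            self.word_counts[label].update(text.lower().split())
        self.vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])

    def predict(self, text):
        total_docs = sum(self.doc_counts.values())
        scores = {}
        for label in ("spam", "ham"):
            # Laplace-smoothed log prior for the class.
            score = math.log((self.doc_counts[label] + 1) / (total_docs + 2))
            total_words = sum(self.word_counts[label].values())
            for word in text.lower().split():
                count = self.word_counts[label][word]
                # Laplace-smoothed log likelihood; the extra +1 covers unseen words.
                score += math.log((count + 1) / (total_words + len(self.vocab) + 1))
            scores[label] = score
        return max(scores, key=scores.get)


class VotingSpamFilter:
    """Majority vote over independently trained sub-classifiers."""

    def __init__(self, sub_classifiers):
        self.sub_classifiers = list(sub_classifiers)

    def predict(self, text):
        votes = Counter(clf.predict(text) for clf in self.sub_classifiers)
        # Assumption: a tie is resolved as "ham" to avoid false positives.
        return "spam" if votes["spam"] > votes["ham"] else "ham"

    def replace(self, index, messages, labels):
        # Incremental update: retrain one small sub-classifier, leave the rest untouched.
        self.sub_classifiers[index] = TinyNaiveBayes(messages, labels)


# Toy batches standing in for small slices of a corpus (hypothetical data).
batches = [
    (["win free money now", "meeting at noon tomorrow"], ["spam", "ham"]),
    (["free prize claim now", "lunch with the project team"], ["spam", "ham"]),
    (["cheap pills free offer", "quarterly report attached"], ["spam", "ham"]),
]
spam_filter = VotingSpamFilter(TinyNaiveBayes(m, l) for m, l in batches)

label_spam = spam_filter.predict("free money offer")
label_ham = spam_filter.predict("project meeting report")

# Swap out only the first sub-classifier with one trained on newer mail.
old_first = spam_filter.sub_classifiers[0]
spam_filter.replace(
    0,
    ["urgent invoice payment overdue", "see you at the gym"],
    ["spam", "ham"],
)
```

Because each sub-classifier sees only a small batch, the `replace` call retrains on a handful of messages rather than the full training set, which is the source of the shorter update time the abstract refers to.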