Reducing Classification Times for Email Spam Using Incremental Multiple Instance Classifiers

  • Teng-Sheng Moh
  • Nicholas Lee
Part of the Communications in Computer and Information Science book series (CCIS, volume 141)


Combating spam email is both costly and time-consuming. This paper presents a spam classification algorithm that combines majority voting with a multiple instance approach to determine the final classification. Because the classifier is built from multiple sub-classifiers, it can be updated by replacing an individual sub-classifier rather than retraining the whole. Furthermore, each sub-classifier represents only a small fraction of a typical classifier, so it can be trained in less time and with less data. The experiments were conducted on the TREC 2007 spam corpus.
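The idea of majority voting over replaceable sub-classifiers can be sketched as follows. This is a minimal illustration, not the authors' implementation: the class names `TinyNaiveBayes` and `VotingEnsemble`, the whitespace tokenizer, the oldest-first replacement policy, and the tie-breaking rule are all assumptions made for the example, and the multiple instance step is omitted.

```python
import math
from collections import Counter, deque


class TinyNaiveBayes:
    """Hypothetical sub-classifier: multinomial Naive Bayes on whitespace tokens."""

    def __init__(self, labeled_emails):
        # labeled_emails: iterable of (text, label) pairs, label is "spam" or "ham".
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.doc_counts = Counter()
        for text, label in labeled_emails:
            self.doc_counts[label] += 1
            self.word_counts[label].update(text.lower().split())
        self.vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])

    def predict(self, text):
        total_docs = sum(self.doc_counts.values())
        scores = {}
        for label in ("spam", "ham"):
            # Smoothed log prior for the class.
            score = math.log((self.doc_counts[label] + 1) / (total_docs + 2))
            # Add-one smoothing; the extra +1 reserves mass for unseen words.
            denom = sum(self.word_counts[label].values()) + len(self.vocab) + 1
            for word in text.lower().split():
                score += math.log((self.word_counts[label][word] + 1) / denom)
            scores[label] = score
        return "spam" if scores["spam"] > scores["ham"] else "ham"


class VotingEnsemble:
    """Fixed-size pool of sub-classifiers combined by majority vote."""

    def __init__(self, max_size):
        # A bounded deque drops the oldest sub-classifier when a new one is added.
        self.pool = deque(maxlen=max_size)

    def add_batch(self, labeled_emails):
        # Incremental update: train one small sub-classifier on the new batch only.
        self.pool.append(TinyNaiveBayes(labeled_emails))

    def predict(self, text):
        votes = Counter(clf.predict(text) for clf in self.pool)
        # Assumed tie rule: a split vote is treated as "ham".
        return "spam" if votes["spam"] > votes["ham"] else "ham"
```

In this sketch, calling `add_batch` with a fresh batch of labeled mail is the whole update step: only the new sub-classifier is trained, and the rest of the pool is left untouched.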


Keywords: multiple instance classifiers, email spam, Naïve Bayes, TREC 2007 spam corpus





Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Teng-Sheng Moh (1)
  • Nicholas Lee (1)
  1. Department of Computer Science, San Jose State University, San Jose, U.S.A.
