An Add-On to Rule-Based Sifters for Multi-recipient Spam Emails

  • Vipul Sharma
  • Puneet Sarda
  • Swasti Sharma
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3513)

Abstract

The Spam filtering technique described here targets Spam messages sent to multiple recipients whose email addresses are similar. We exploit these similarity patterns to build a rule-based classification system (accuracy 92%). The technique uses the ‘TO’ and ‘CC’ fields to classify an email as Spam or Legitimate. We introduce new rules that should enhance the performance of current filtering techniques [1][4][5]. We also introduce a novel metric for calculating the degree of similarity among a set of strings.
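The abstract does not spell out the similarity metric or the rules, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes a stand-in metric (mean pairwise normalized Levenshtein similarity over the local parts of the addresses) and a hypothetical threshold of 0.7; the names `set_similarity` and `looks_like_multi_recipient_spam` are invented for this example.

```python
from itertools import combinations


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def local_part(address: str) -> str:
    """Return the lower-cased part of an address before the first '@'."""
    return address.split("@", 1)[0].lower()


def set_similarity(addresses) -> float:
    """Mean pairwise similarity in [0, 1] of the local parts.

    ASSUMPTION: this stand-in is NOT the metric proposed in the paper.
    Returns 0.0 when fewer than two addresses are given.
    """
    names = [local_part(a) for a in addresses]
    pairs = list(combinations(names, 2))
    if not pairs:
        return 0.0
    total = 0.0
    for x, y in pairs:
        longest = max(len(x), len(y))
        total += 1.0 if longest == 0 else 1.0 - edit_distance(x, y) / longest
    return total / len(pairs)


def looks_like_multi_recipient_spam(to_cc, threshold: float = 0.7) -> bool:
    """Hypothetical rule: flag mail whose TO/CC addresses are very similar."""
    return len(to_cc) >= 2 and set_similarity(to_cc) >= threshold


recipients = ["john1@example.com", "john2@example.com", "john3@example.com"]
print(looks_like_multi_recipient_spam(recipients))
```

In a rule-based filter, a check of this kind would presumably contribute one score among many rather than decide the classification on its own; the threshold would need tuning on labelled mail.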

References

  1. Parker, M.: Storing SpamAssassin User Data in SQL Databases. ApacheCon (2004)
  2. Wu, D., Vapnik, V.: Support vector machine for text categorization (1998)
  3. Witten, I.H., Frank, E.: Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations. Academic Press, London (2000)
  4. Androutsopoulos, I., Paliouras, G., Michelakis, E.: Learning to filter unsolicited commercial e-mail. Technical Report, National Centre for Scientific Research Demokritos (2004)
  5. Sakkis, G., Androutsopoulos, I., Paliouras, G., Karkaletsis, V., Spyropoulos, C.D., Stamatopoulos, P.: A memory-based approach to anti-spam filtering for mailing lists. Information Retrieval 6(1), 49–73 (2003)
  6. Navarro, G.: A guided tour to approximate string matching. ACM Computing Surveys 33(1), 31–88 (2001)

Copyright information

© Springer-Verlag Berlin Heidelberg 2005

Authors and Affiliations

  • Vipul Sharma (1)
  • Puneet Sarda (1)
  • Swasti Sharma (2)
  1. Department of Computer Science, University of Houston, Houston, USA
  2. Computer Science Department, College of Engineering Roorkee, Roorkee, India
