Combining Scheduling Heuristics to Improve e-mail Filtering Throughput
To cope with the massive increase in spam deliveries worldwide, spam-filtering service providers require new filtering schemes that classify illegitimate content efficiently while using fewer computational resources. As a consequence, several improvements have been introduced into rule-based filtering platforms in recent years. In this context, different research works have shown the relevance of scheduling the evaluation of rules to improve filtering throughput, and have contributed some interesting heuristics to address this task. In this work, we introduce a novel scheduling approach that combines individual heuristics to improve overall filtering throughput.
Keywords: spam detection, rule optimization, schedulers, prescheduling rules, filtering throughput, Wirebrush4SPAM framework
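To make the idea of rule scheduling concrete, the following minimal Python sketch shows how the order in which rules are evaluated can change the number of rules a filter has to run. It is an illustration under stated assumptions, not the method proposed in this work: the rule names, scores, costs and threshold are hypothetical, the filter is assumed to sum the scores of matching rules and compare the total against a threshold, evaluation is assumed to stop once the pending rules can no longer change the verdict, and the two heuristics are combined with a simple rank sum chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass(frozen=True)
class Rule:
    name: str
    score: float                    # added to the total when the rule matches
    cost: float                     # hypothetical relative evaluation cost
    matches: Callable[[str], bool]


# A heuristic maps a rule to a sort key; rules with smaller keys run earlier.
Heuristic = Callable[[Rule], float]


def cheapest_first(rule: Rule) -> float:
    return rule.cost


def largest_score_first(rule: Rule) -> float:
    return -abs(rule.score)


def combined_schedule(rules: Sequence[Rule],
                      heuristics: Sequence[Heuristic]) -> List[Rule]:
    """Order rules by the sum of their rank positions under each heuristic."""
    rank_sum = {rule.name: 0 for rule in rules}
    for heuristic in heuristics:
        for position, rule in enumerate(sorted(rules, key=heuristic)):
            rank_sum[rule.name] += position
    return sorted(rules, key=lambda rule: rank_sum[rule.name])


def classify(message: str, schedule: Sequence[Rule],
             threshold: float) -> Tuple[bool, int]:
    """Return (is_spam, rules_evaluated), stopping once the verdict is fixed."""
    total = 0.0
    # Largest amounts the pending rules could still add or subtract.
    max_gain = sum(r.score for r in schedule if r.score > 0)
    max_loss = sum(-r.score for r in schedule if r.score < 0)
    evaluated = 0
    for rule in schedule:
        if total - max_loss >= threshold or total + max_gain < threshold:
            break  # no pending rule can change the verdict
        evaluated += 1
        if rule.score > 0:
            max_gain -= rule.score
        else:
            max_loss -= abs(rule.score)
        if rule.matches(message):
            total += rule.score
    return total >= threshold, evaluated


# Hypothetical rule set and threshold, for illustration only.
RULES = [
    Rule("SUSPICIOUS_WORD", 3.0, 5.0, lambda m: "lottery" in m.lower()),
    Rule("HAS_LINK", 1.0, 1.0, lambda m: "http://" in m),
    Rule("KNOWN_SENDER", -2.0, 2.0, lambda m: "alice@example.org" in m),
    Rule("SHOUTING", 2.0, 3.0, lambda m: "!!!" in m),
]
THRESHOLD = 4.0

schedule = combined_schedule(RULES, [cheapest_first, largest_score_first])
spam = "WIN the LOTTERY now!!! Visit http://example.test"
ham = "From: alice@example.org\nLunch tomorrow?"

print([rule.name for rule in schedule])
print(classify(spam, schedule, THRESHOLD))  # (True, 3): 3 of 4 rules evaluated
print(classify(ham, schedule, THRESHOLD))   # (False, 2): 2 of 4 rules evaluated
```

In this toy setting the verdict never depends on the schedule; only the number of rules evaluated does, which is why the ordering affects throughput. A rank sum is just one plausible way to merge several orderings, and other combination strategies could be substituted in `combined_schedule` without touching `classify`.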