A Personal Antispam System Based on a Behaviour-Knowledge Space Approach

  • Chapter

Part of the book series: Studies in Computational Intelligence ((SCI,volume 245))

Abstract

Unsolicited Commercial E-mail (UCE), commonly known as spam, causes serious problems in both working and everyday life: private individuals, small companies, and large public or private institutions all feel that spam has weakened the reliability and effectiveness of e-mail as a communication tool. A modern mail management system therefore needs simple, fast, and effective countermeasures against spam. In this chapter we describe a novel method for detecting spam messages that analyzes both the text and the attached images of an e-mail. In particular, we describe an architecture for deploying a personal antispam system able to overcome some problems that still beset state-of-the-art spam filters. Text analysis draws on recent advances in both semantic and syntactic analysis; in addition, spammers' tricks based on images are taken into account. A Behaviour-Knowledge Space approach fuses the results obtained from the different parts of the e-mail and enhances the performance of the proposed system, as shown by the experiments we have carried out.
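The fusion step named in the abstract can be illustrated with a minimal sketch of a Behaviour-Knowledge Space combiner in its generally known form: a lookup table indexed by the tuple of individual classifier decisions, where each cell counts the true classes of the training messages that produced that tuple. This is not the chapter's implementation. The class name `BKSCombiner`, the three experts (two text analyzers and one image analyzer), the label strings, and the toy training data are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


class BKSCombiner:
    """Illustrative Behaviour-Knowledge Space fusion of discrete decisions."""

    def __init__(self, reject_label=None):
        # Label returned when a decision tuple was never seen in training.
        self.reject_label = reject_label
        # Each cell maps a tuple of expert decisions to true-label counts.
        self.table = defaultdict(Counter)

    def fit(self, decisions, labels):
        # Fill the table: one count per training message.
        for combo, label in zip(decisions, labels):
            self.table[tuple(combo)][label] += 1
        return self

    def predict(self, combo):
        # Look up the cell without creating it; empty cells are rejected.
        cell = self.table.get(tuple(combo))
        if not cell:
            return self.reject_label
        # Return the true label seen most often for this decision tuple.
        return cell.most_common(1)[0][0]


# Hypothetical decisions of (semantic text, syntactic text, image) experts.
train_decisions = [
    ("spam", "spam", "ham"),
    ("spam", "spam", "ham"),
    ("ham", "ham", "spam"),
    ("ham", "ham", "spam"),
    ("ham", "ham", "spam"),
    ("ham", "ham", "ham"),
    ("ham", "ham", "ham"),
]
train_labels = ["spam", "spam", "spam", "spam", "ham", "ham", "ham"]

combiner = BKSCombiner(reject_label="unknown").fit(train_decisions, train_labels)
print(combiner.predict(("ham", "ham", "spam")))
print(combiner.predict(("spam", "ham", "spam")))
```

In this toy data the tuple `("ham", "ham", "spam")` is resolved as spam because two of the three training messages with that pattern were spam, even though a majority vote over the three experts would say ham. A tuple that never occurred in training falls back to the reject label; practical systems often add a confidence threshold on the winning cell count as well.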


Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Gargiulo, F., Penta, A., Picariello, A., Sansone, C. (2009). A Personal Antispam System Based on a Behaviour-Knowledge Space Approach. In: Okun, O., Valentini, G. (eds) Applications of Supervised and Unsupervised Ensemble Methods. Studies in Computational Intelligence, vol 245. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-03999-7_3

  • DOI: https://doi.org/10.1007/978-3-642-03999-7_3

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-03998-0

  • Online ISBN: 978-3-642-03999-7

  • eBook Packages: Engineering, Engineering (R0)
