User Preference-Based Spamming Detection with Coupled Behavioral Analysis

Jiang, Frank; Tang, Mingdong; Tran, Quang Anh

doi:10.1007/978-3-319-49148-6_38

Part of the book series: Lecture Notes in Computer Science (LNSC, volume 10066)

Included in the following conference series:

International Conference on Security, Privacy and Anonymity in Computation, Communication and Storage

Abstract

The explosive growth of unsolicited email on the Internet challenges spam filtering systems, particularly at big-data scale. Current spam filters suffer from three problems: (1) they are not personalised; (2) they rely on comparatively static association rules defined in firewalls or gateways; (3) they cannot identify deeply hidden information mixed into the syntax or semantics of a message. To overcome these problems, we develop and implement a new email spam detection system that leverages coupled text similarity analysis of user preference and a virtual meta-layer, user-based email network. We take social networks or campus LANs as the spam social network scenario. Few current practices exploit social networking to assist in spam filtering, and a social network inherently has a large number of account features to consider.

We construct a new model, called the meta-layer email network, which reduces these features by considering only an individual user’s actions, i.e., a replying network, a reading network and a deleting network. For the first time, these common user actions are used to construct a social behavior-based email network. Further, we develop a coupled selection model for this email network, which lets us consider all relevant factors and features as a whole and recommend emails to each user individually. The experimental data come from the Enron email dataset, which is recognized as a representative dataset for testing and validation. The experimental results show that the new approach achieves higher precision and accuracy, with better email ranking in favor of personalised preference.
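The abstract does not give the construction in detail, so the following is only a minimal sketch of how per-action networks of this kind might be assembled, not the authors' model. The event format, the function names (build_action_networks, preference_score, rank_senders) and the action weights are all assumptions made for illustration. The score is a plain weighted average of action counts and stands in for the coupled selection model, which is not reproduced here.

```python
from collections import defaultdict

# Action labels taken from the abstract (replying, reading, deleting).
ACTIONS = ("reply", "read", "delete")

# Assumed weights, chosen only for illustration: replying signals strong
# interest, reading mild interest, deleting disinterest.
WEIGHTS = {"reply": 1.0, "read": 0.5, "delete": -1.0}


def build_action_networks(events):
    """Build one weighted directed graph per user action.

    events: iterable of (user, sender, action) tuples, meaning that
    `user` performed `action` on an email received from `sender`.
    Returns {action: {(user, sender): count}}.
    """
    networks = {action: defaultdict(int) for action in ACTIONS}
    for user, sender, action in events:
        if action not in networks:
            raise ValueError(f"unknown action: {action!r}")
        networks[action][(user, sender)] += 1
    return networks


def preference_score(networks, user, sender):
    """Weighted sum of action counts, normalised by total interactions."""
    counts = {a: networks[a].get((user, sender), 0) for a in ACTIONS}
    total = sum(counts.values())
    if total == 0:
        return 0.0  # no history between this user and sender
    return sum(WEIGHTS[a] * counts[a] for a in ACTIONS) / total


def rank_senders(networks, user, senders):
    """Order senders from most to least preferred for one user."""
    return sorted(
        senders,
        key=lambda s: preference_score(networks, user, s),
        reverse=True,
    )


if __name__ == "__main__":
    # Hypothetical event log for a single user.
    events = [
        ("alice", "bob", "reply"),
        ("alice", "bob", "read"),
        ("alice", "promo", "delete"),
        ("alice", "promo", "delete"),
        ("alice", "carol", "read"),
    ]
    nets = build_action_networks(events)
    print(rank_senders(nets, "alice", ["promo", "carol", "bob"]))
```

Keeping one edge map per action mirrors the idea of separate replying, reading and deleting networks. Any personalised ranking built on top would replace the simple weighted average used here.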

Author information

Authors and Affiliations

Faculty of Engineering and IT, University of Technology Sydney, Sydney, 2007, Australia
Frank Jiang
School of Computer Science, Hunan University of Science and Technology, Hunan, 411201, China
Mingdong Tang
Posts and Telecommunications Institute of Technology, Hanoi, 10000, Vietnam
Quang Anh Tran

Corresponding author

Correspondence to Mingdong Tang.

Editor information

Editors and Affiliations

Guangzhou University, Guangzhou, China
Guojun Wang
Department of Computer Science, Colorado State University, Fort Collins, Colorado, USA
Indrakshi Ray
University of the West of Scotland, Paisley, Glasgow, United Kingdom
Jose M. Alcaraz Calero
Indian Institute of Information Technology and Management, Kerala (IIITMK), Trivandrum, Kerala, India
Sabu M. Thampi

Cite this paper

Jiang, F., Tang, M., Tran, Q.A. (2016). User Preference-Based Spamming Detection with Coupled Behavioral Analysis. In: Wang, G., Ray, I., Alcaraz Calero, J., Thampi, S. (eds) Security, Privacy, and Anonymity in Computation, Communication, and Storage. SpaCCS 2016. Lecture Notes in Computer Science, vol 10066. Springer, Cham. https://doi.org/10.1007/978-3-319-49148-6_38

DOI: https://doi.org/10.1007/978-3-319-49148-6_38
Published: 10 November 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-49147-9
Online ISBN: 978-3-319-49148-6
eBook Packages: Computer Science, Computer Science (R0)
