Behavioral Analysis of Users for Spammer Detection in a Multiplex Social Network

  • Tahereh Pourhabibi
  • Yee Ling Boo
  • Kok-Leong Ong
  • Booi Kam
  • Xiuzhen Zhang
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 996)

Abstract

A growing number of social networking websites now have millions of users, creating fertile ground for “spammers” to exploit these sites for their own gain by constantly exposing other users to malicious communications. The variety of interactions afforded by these social networks has resulted in a Multiplex Network of interactions. In these networks, malicious users evade detection by frequently changing the nature of their activities, which makes it challenging to analyse users’ interactions and capture anomalous behaviours. In this paper, we aimed to detect spammers in a large time-evolving multiplex social network, Tagged.com. For this purpose, we used four sets of features: (i) a set of light-weight behavioural features to capture the structural behaviour of users in their neighbourhood network; (ii) a set of bursty features and (iii) a set of sequence-based features to capture the temporal behaviour of users; and (iv) a set of profile-based features used as side information. In addition, we employed an unsupervised Laplacian Score based approach for feature selection and dimensionality reduction of the feature space. The experimental results showed an accuracy of over 88% in spammer detection with a lower empirical time complexity for feature extraction. Implementing the behavioural and bursty features in a relational data management system makes the proposed approach more practical, since most real-world networks store their data in relational databases.
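
The Laplacian Score step mentioned above can be illustrated with a minimal sketch. It follows the standard formulation of the score (a k-nearest-neighbour graph with heat-kernel weights; features with lower scores preserve local structure better). The abstract does not state how the graph was built or which parameters were used, so the function name laplacian_scores, the neighbour count k, the kernel width t and the toy data are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def laplacian_scores(X, k=5, t=1.0):
    """Return one Laplacian Score per column of X (lower is better).

    X is an (n_samples, n_features) array. k and t are illustrative
    defaults, not values taken from the paper.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]

    # Pairwise squared Euclidean distances between samples.
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)

    # Indices of the k nearest neighbours of each sample, excluding itself.
    masked = d2.copy()
    np.fill_diagonal(masked, np.inf)
    nn = np.argsort(masked, axis=1)[:, :k]

    # Heat-kernel weights on kNN edges, symmetrised so S is undirected.
    S = np.zeros((n, n))
    rows = np.repeat(np.arange(n), k)
    cols = nn.ravel()
    S[rows, cols] = np.exp(-d2[rows, cols] / t)
    S = np.maximum(S, S.T)

    d = S.sum(axis=1)        # node degrees (diagonal of D)
    L = np.diag(d) - S       # graph Laplacian L = D - S

    # Remove the degree-weighted mean from every feature.
    Xc = X - (d @ X) / d.sum()

    # Score_r = (f_r^T L f_r) / (f_r^T D f_r) for each feature column f_r.
    num = np.einsum("ir,ij,jr->r", Xc, L, Xc)
    den = np.einsum("ir,i,ir->r", Xc, d, Xc)
    return num / np.maximum(den, 1e-12)


if __name__ == "__main__":
    # Toy data (made up): column 0 separates two groups, column 1 is noise.
    rng = np.random.default_rng(0)
    groups = np.repeat([0.0, 10.0], 20)
    X = np.column_stack([groups + rng.normal(0, 0.1, 40),
                         rng.normal(0, 1.0, 40)])
    print(laplacian_scores(X))
```

Because the score needs no class labels, features can be ranked by ascending score and the top-ranked ones kept; how many to keep is a separate choice that the abstract does not specify.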

Keywords

Spammer detection; Multiplex network; Feature extraction; Behavioural feature; Temporal; Laplacian score; Un-supervised feature selection

Copyright information

© Springer Nature Singapore Pte Ltd. 2019

Authors and Affiliations

  • Tahereh Pourhabibi (1)
  • Yee Ling Boo (1)
  • Kok-Leong Ong (2)
  • Booi Kam (1)
  • Xiuzhen Zhang (1)

  1. RMIT University, Melbourne, Australia
  2. La Trobe University, Melbourne, Australia
