ENWalk: Learning Network Features for Spam Detection in Twitter

Santosh, K. C.; Maity, Suman Kalyan; Mukherjee, Arjun

doi:10.1007/978-3-319-60240-0_11

Conference paper
First Online: 15 June 2017

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10354)

Abstract

Social media are growing in influence, and the vast amount of public information they carry has led companies and organizations to use them actively for marketing. Unlike promotions in traditional media such as TV and newspapers, such marketing promotions are difficult to identify, so it is important to identify the promoters in social media. Although research on this problem is active and ongoing, existing approaches are far from solving it. To identify such imposters, it is important to understand their strategies of social circle creation and the dynamics of their content posting. Are there specific spammer types? How successful is each type? We analyze these questions in the light of social relationships in Twitter. Our analyses reveal two types of spammers and their relationships with the dynamics of content posts. Our results reveal novel dynamics of spamming that are intuitive and arguable. We propose ENWalk, a framework that detects spammers by learning feature representations of users in social media. We learn the feature representations using random walks biased on the spam dynamics. Experimental results on a large-scale Twitter network and the corresponding tweets show that our approach is effective and outperforms existing approaches.
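
As an illustration of the general idea of random walks biased by per-user signals, the following minimal sketch generates walks over a toy follower graph in which the next hop is sampled in proportion to a node weight. It is not the ENWalk procedure itself: the abstract does not specify how the bias is computed, so the `bias` scores, the toy graph, and the function names `biased_walk` and `generate_walks` are all assumptions made for this example.

```python
import random


def biased_walk(graph, bias, start, length, rng):
    """Return one walk of at most `length` nodes starting at `start`.

    graph: dict mapping node -> list of neighbour nodes
    bias:  dict mapping node -> non-negative weight
           (a hypothetical stand-in for a spam-dynamics score)
    """
    walk = [start]
    while len(walk) < length:
        neighbours = graph.get(walk[-1], [])
        if not neighbours:
            break  # dead end: stop the walk early
        weights = [bias.get(n, 0.0) for n in neighbours]
        if sum(weights) <= 0:
            # No usable bias information: fall back to a uniform step.
            nxt = rng.choice(neighbours)
        else:
            # Sample the next node in proportion to its bias weight.
            nxt = rng.choices(neighbours, weights=weights, k=1)[0]
        walk.append(nxt)
    return walk


def generate_walks(graph, bias, walks_per_node, length, seed=0):
    """Start `walks_per_node` biased walks from every node in the graph."""
    rng = random.Random(seed)
    walks = []
    for _ in range(walks_per_node):
        nodes = list(graph)
        rng.shuffle(nodes)  # vary the start order between passes
        for node in nodes:
            walks.append(biased_walk(graph, bias, node, length, rng))
    return walks


# Toy, made-up data: an undirected graph and assumed per-user scores.
toy_graph = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b", "d"], "d": ["c"]}
toy_bias = {"a": 1.0, "b": 1.0, "c": 3.0, "d": 0.5}
walks = generate_walks(toy_graph, toy_bias, walks_per_node=2, length=5)
```

In approaches of this family, such walks are typically treated as sentences and passed to a skip-gram model to obtain one vector per user, which a standard classifier can then consume. That step is omitted here.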

Notes

1. http://theatln.tc/2m8g3eA
2. http://bzfd.it/2m8rlja
3. http://bit.ly/2kJiMKu
4. http://bit.ly/1ViorHd, http://53eig.ht/2kzrhfL

Acknowledgements

This work is supported in part by NSF 1527364. We also thank the anonymous reviewers for their helpful feedback.

Author information

Authors and Affiliations

University of Houston, Houston, USA
K. C. Santosh & Arjun Mukherjee
IIT Kharagpur, Kharagpur, India
Suman Kalyan Maity

Corresponding author

Correspondence to K. C. Santosh.

Editor information

Editors and Affiliations

Penn State University, State College, Pennsylvania, USA
Dongwon Lee
University of Pittsburgh, Pittsburgh, Pennsylvania, USA
Yu-Ru Lin
University of Saskatchewan, Saskatoon, Saskatchewan, Canada
Nathaniel Osgood
United States Military Academy, West Point, New York, USA
Robert Thomson

About this paper

Cite this paper

Santosh, K.C., Maity, S.K., Mukherjee, A. (2017). ENWalk: Learning Network Features for Spam Detection in Twitter. In: Lee, D., Lin, YR., Osgood, N., Thomson, R. (eds) Social, Cultural, and Behavioral Modeling. SBP-BRiMS 2017. Lecture Notes in Computer Science, vol 10354. Springer, Cham. https://doi.org/10.1007/978-3-319-60240-0_11

DOI: https://doi.org/10.1007/978-3-319-60240-0_11
Published: 15 June 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-60239-4
Online ISBN: 978-3-319-60240-0
eBook Packages: Computer Science, Computer Science (R0)
