Detecting Marionette Microblog Users for Improved Information Credibility

Conference paper

pp 483–498
Machine Learning and Knowledge Discovery in Databases (ECML PKDD 2013)

Xian Wu, Ziming Feng, Wei Fan, Jing Gao & Yong Yu

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8190)

Included in the following conference series:

Joint European Conference on Machine Learning and Knowledge Discovery in Databases

Abstract

In this paper, we study a special group of microblog users: “marionette” users, which are created or employed by backstage “puppeteers”, either through programs or manually. Unlike normal users, who access microblogs for information sharing or social communication, marionette users perform specific tasks for financial profit. For example, they follow certain users to increase those users' “statistical popularity”, or retweet certain tweets to amplify the tweets' “statistical impact”. The fabricated follower and retweet counts not only steer normal users toward wrong information but also seriously impair microblog-based applications such as popular tweet selection and expert finding. We address the problem of detecting marionette users on microblog platforms. This problem is challenging because puppeteers employ complicated strategies to generate marionette users that behave similarly to normal ones. To tackle this challenge, we take into account two types of discriminative information: (1) individual user tweeting behaviors and (2) the social interactions among users. By integrating both types of information into a semi-supervised probabilistic model, we can effectively distinguish marionette users from normal ones. Applying the proposed model to Sina Weibo, one of the most popular microblog platforms in China, we find that it detects marionette users with an F-measure close to 0.9. In addition, we propose an application that measures the credibility of retweet counts.
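
For readers unfamiliar with the evaluation metric, the F-measure reported above is conventionally the harmonic mean of precision and recall on the positive class (here, marionette users). The following is a minimal sketch of that standard computation. The function name `f_measure` and the toy labels are illustrative assumptions; they are not the paper's code or data, and the abstract does not say which F-measure variant was used.

```python
def f_measure(true_labels, predicted_labels, positive=1):
    """Return (precision, recall, F1) for the `positive` class.

    Hypothetical helper for illustration; 1 marks a marionette user,
    0 marks a normal user.
    """
    pairs = list(zip(true_labels, predicted_labels))
    # Count true positives, false positives, and false negatives.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy labels (made up for this sketch, not taken from the paper).
truth = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
precision, recall, f1 = f_measure(truth, predicted)
print(precision, recall, f1)
```

In this toy case three of the four true marionette users are found and one normal user is wrongly flagged, so precision and recall are both 3/4 and the F-measure is also 3/4.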

Author information

Authors and Affiliations

Shanghai Jiao Tong University, Shanghai, 200240, P.R. China
Xian Wu, Ziming Feng & Yong Yu
Huawei Noah’s Ark Lab, Hong Kong
Wei Fan
University at Buffalo, NY, 14260, USA
Jing Gao

Editor information

Editors and Affiliations

Department of Computer Science, Katholieke Universiteit Leuven, Celestijnenlaan 200A, 3001, Leuven, Belgium
Hendrik Blockeel
Fraunhofer IAIS, Department of Knowledge Discovery, Schloss Birlinghoven, University of Bonn, 53754, Sankt Augustin, Germany
Kristian Kersting
LIACS, Universiteit Leiden, Niels Bohrweg 1, 2333 CA, Leiden, The Netherlands
Siegfried Nijssen
Department of Computer Science and Engineering, Czech Technical University, Technicka 2, 16627, Prague 6, Czech Republic
Filip Železný

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Wu, X., Feng, Z., Fan, W., Gao, J., Yu, Y. (2013). Detecting Marionette Microblog Users for Improved Information Credibility. In: Blockeel, H., Kersting, K., Nijssen, S., Železný, F. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2013. Lecture Notes in Computer Science, vol 8190. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40994-3_31

DOI: https://doi.org/10.1007/978-3-642-40994-3_31
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-40993-6
Online ISBN: 978-3-642-40994-3
eBook Packages: Computer Science, Computer Science (R0)
