Authorship verification applied to detection of compromised accounts on online social networks

Barbon, Sylvio; Igawa, Rodrigo Augusto; Bogaz Zarpelão, Bruno

doi:10.1007/s11042-016-3899-8

Authorship verification applied to detection of compromised accounts on online social networks

A continuous approach

Published: 05 September 2016

Volume 76, pages 3213–3233, (2017)

Multimedia Tools and Applications

Sylvio Barbon Jr¹,
Rodrigo Augusto Igawa¹ &
Bruno Bogaz Zarpelão¹

1073 Accesses
34 Citations

Abstract

Compromising legitimate accounts has been the most widely used strategy for spreading malicious content on Online Social Networks (OSNs). To address this problem, we propose a pure text mining approach that checks whether an account has been compromised based on the content of its posts. In the first step, the proposed approach extracts the writing style from the user account. The second step applies the k-Nearest Neighbors (k-NN) algorithm to evaluate the post content and identify the user. Finally, the third step, Baseline Updating, continuously updates the user baseline to accommodate current trends and seasonality in the user's posts. Experiments were carried out on a Twitter dataset composed of tweets from 1000 users. Each of the three steps was evaluated individually, and the results show that the developed method is stable and can detect compromised accounts. An important observation is the contribution of Baseline Updating, which improves accuracy by more than 60%. In terms of average accuracy, the developed method achieved results above 93%.
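The three steps named in the abstract can be pictured with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the style features, the distance measure, the value of k, or the update rule, so everything below is an assumption made for illustration. Writing style is approximated by character trigram frequencies, k-NN uses cosine distance with k = 3, and Baseline Updating is modelled as a fixed-size sliding window that only accepts posts attributed to the account owner. All names (`style_vector`, `knn_predict`, `Baseline`, `check_post`) are hypothetical.

```python
import math
from collections import Counter, deque


def style_vector(text, n=3):
    """Step 1 (assumed features): relative frequencies of character n-grams."""
    text = text.lower()
    grams = [text[i:i + n] for i in range(len(text) - n + 1)]
    counts = Counter(grams)
    total = sum(counts.values()) or 1
    return {gram: count / total for gram, count in counts.items()}


def cosine_distance(a, b):
    """Cosine distance between two sparse vectors stored as dicts."""
    dot = sum(value * b.get(gram, 0.0) for gram, value in a.items())
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def knn_predict(post, labelled, k=3):
    """Step 2: majority label among the k nearest labelled style vectors."""
    query = style_vector(post)
    nearest = sorted(labelled, key=lambda item: cosine_distance(query, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


class Baseline:
    """Step 3 (assumed rule): sliding window over the owner's recent posts."""

    def __init__(self, owner, max_posts=50):
        self.owner = owner
        self.vectors = deque(maxlen=max_posts)  # oldest posts drop out first

    def update(self, post):
        self.vectors.append(style_vector(post))

    def labelled(self):
        return [(vector, self.owner) for vector in self.vectors]


def check_post(post, baseline, others, k=3):
    """Return True if the post is attributed to the owner; if so, update the baseline."""
    predicted = knn_predict(post, baseline.labelled() + others, k)
    attributed_to_owner = predicted == baseline.owner
    if attributed_to_owner:
        baseline.update(post)
    return attributed_to_owner


# Toy usage with made-up posts (not data from the paper).
alice = Baseline("alice")
for text in ["omg loool this is sooo funny!!!",
             "loool omg sooo tired today!!!",
             "sooo hungry omg loool!!!"]:
    alice.update(text)

others = [(style_vector(text), "other") for text in [
    "The quarterly report is attached for your review.",
    "Please review the attached quarterly figures.",
    "The meeting is scheduled for review of the report.",
]]

print(check_post("omg sooo late again loool!!!", alice, others))
print(check_post("Please find the report attached for review.", alice, others))
```

In this sketch a post that is not attributed to the owner leaves the baseline untouched, so suspicious content does not contaminate the profile; whether the paper's Baseline Updating step behaves this way is not stated in the abstract.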

Author information

Authors and Affiliations

Londrina State University, Londrina, Brazil
Sylvio Barbon Jr, Rodrigo Augusto Igawa & Bruno Bogaz Zarpelão

Corresponding author

Correspondence to Sylvio Barbon Jr.

About this article

Cite this article

Barbon, S., Igawa, R.A. & Bogaz Zarpelão, B. Authorship verification applied to detection of compromised accounts on online social networks. Multimed Tools Appl 76, 3213–3233 (2017). https://doi.org/10.1007/s11042-016-3899-8

Received: 15 January 2016
Revised: 15 August 2016
Accepted: 24 August 2016
Published: 05 September 2016
Issue Date: February 2017
DOI: https://doi.org/10.1007/s11042-016-3899-8
