Bot prediction on social networks of Twitter in altmetrics using deep graph convolutional networks

Abstract

In the context of smart cities, it is crucial to filter out falsified information spread on social media through paid campaigns or bot accounts, which significantly influence communication networks across social communities and may affect smart decision-making by citizens. In this paper, we focus on two major aspects of the Twitter social network associated with altmetrics: (a) analyzing the properties of bots on Twitter networks and (b) distinguishing between bot and human accounts. First, we employed state-of-the-art social network analysis techniques that exploit Twitter’s social network properties in novel altmetrics data. We found that 87% of tweets are affected by bots that are involved in the network’s dominant communities. We also found that, to some extent, community size and the degree distribution in Twitter’s altmetrics network follow a power-law distribution. Furthermore, we applied a deep learning model, graph convolutional networks, to distinguish between organic (human) and bot Twitter accounts. The model achieved promising results, with up to 71% classification accuracy over 200 epochs. Overall, the study concludes that bot presence on altmetrics-associated social media platforms can artificially inflate social usage counts. As a result, special attention is required to eliminate such discrepancies when using altmetrics data for smart decision-making, such as research assessment, either independently or as a complement to traditional bibliometric indices.
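
The abstract names graph convolutional networks as the classifier but gives no detail of the architecture. As a rough illustration only, the sketch below shows the forward pass of a generic two-layer GCN using the widely used propagation rule softmax(Â · ReLU(Â X W0) · W1), where Â is the adjacency matrix with self-loops added and symmetrically normalized. The four-account toy graph, the two features per account, and the hand-picked, untrained weights (`ADJ`, `FEATURES`, `W0`, `W1`) are all invented for this example; nothing here reproduces the authors' data, features, or trained model.

```python
import math

def normalized_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2} for a dense, symmetric adjacency matrix."""
    n = len(adj)
    # Add self-loops so each node keeps its own features during aggregation.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    inv_sqrt_deg = [1.0 / math.sqrt(sum(row)) for row in a_hat]
    return [[inv_sqrt_deg[i] * a_hat[i][j] * inv_sqrt_deg[j] for j in range(n)]
            for i in range(n)]

def matmul(a, b):
    """Plain dense matrix product of two list-of-lists matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def relu(m):
    return [[max(0.0, v) for v in row] for row in m]

def softmax_rows(m):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    out = []
    for row in m:
        shift = max(row)
        exps = [math.exp(v - shift) for v in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out

def gcn_forward(adj, features, w0, w1):
    """Two-layer GCN forward pass: softmax(A_norm * ReLU(A_norm * X * W0) * W1)."""
    a_norm = normalized_adjacency(adj)
    hidden = relu(matmul(matmul(a_norm, features), w0))
    logits = matmul(matmul(a_norm, hidden), w1)
    return softmax_rows(logits)

# Hypothetical toy graph: 4 accounts connected in a chain 0-1-2-3.
ADJ = [
    [0.0, 1.0, 0.0, 0.0],
    [1.0, 0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0, 1.0],
    [0.0, 0.0, 1.0, 0.0],
]
# Hypothetical per-account features (two made-up, pre-scaled values each).
FEATURES = [[0.9, 0.1], [0.8, 0.2], [0.2, 0.7], [0.1, 0.9]]
# Hand-picked, untrained weights: 2 inputs -> 3 hidden units -> 2 classes.
W0 = [[0.5, -0.3, 0.8], [-0.4, 0.9, 0.1]]
W1 = [[0.7, -0.6], [-0.5, 0.8], [0.2, 0.1]]

# One row per account: [score for class 0, score for class 1].
probs = gcn_forward(ADJ, FEATURES, W0, W1)
for account, row in enumerate(probs):
    print(account, [round(p, 3) for p in row])
```

In a real setting the weights would be learned by minimizing cross-entropy on labeled accounts over many epochs; the sketch only shows how each layer mixes an account's features with those of its neighbors before classification.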

Author information

Correspondence to Saeed-Ul Hassan.

Ethics declarations

Conflict of interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Communicated by Miltiadis D. Lytras.

About this article

Cite this article

Aljohani, N.R., Fayoumi, A. & Hassan, S. Bot prediction on social networks of Twitter in altmetrics using deep graph convolutional networks. Soft Comput (2020) doi:10.1007/s00500-020-04689-y

Keywords

  • Social media
  • Twitter
  • Information spread
  • Smart cities
  • Bots
  • Prediction
  • Altmetrics
  • Deep learning