Entropy in Network Community as an Indicator of Language Structure in Emoji Usage: A Twitter Study Across Various Thematic Datasets

  • Ryan Hartman
  • S. M. Mahdi Seyednezhad
  • Diego Pinheiro
  • Josemar Faustino
  • Ronaldo Menezes
Conference paper
Part of the Studies in Computational Intelligence book series (SCI, volume 812)


Emojis are emerging as an alternative way to interact and communicate online, and their large-scale adoption has the potential to reveal distinct patterns of human communication and social interaction. In this work, we investigate the hypothesis that emojis are a kind of language. By building networks of emoji co-occurrence, we examine the diversity of the community structure of such networks with regard to predefined categories of emojis. Using four different community detection techniques, we validate our hypothesis on six Twitter datasets: five from specific topics and one random dataset. Our results demonstrate that the community structure of emojis is more diverse when they are used in non-random topics such as politics and sports, and that Stochastic Block Models appear to extract communities with higher diversity.
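
The abstract does not spell out how co-occurrence networks are built or how diversity is scored, so the following is only a minimal sketch of one plausible reading: emojis that appear in the same tweet are linked by a weighted edge, and the diversity of a community is taken to be the Shannon entropy of the predefined category labels of its members. The function names, the toy tweets, and the category mapping are all hypothetical, and the community memberships are supplied by hand rather than produced by any of the four detection techniques used in the paper.

```python
from collections import Counter
from itertools import combinations
from math import log2


def cooccurrence_edges(messages):
    """Count how often each unordered pair of distinct emojis shares a message."""
    edges = Counter()
    for emojis in messages:
        # Deduplicate within a message and sort so each pair has one canonical key.
        for a, b in combinations(sorted(set(emojis)), 2):
            edges[(a, b)] += 1
    return edges


def category_entropy(community, category_of):
    """Shannon entropy (in bits) of the category labels inside one community."""
    counts = Counter(category_of[e] for e in community)
    total = sum(counts.values())
    return -sum((n / total) * log2(n / total) for n in counts.values())


# Toy data (invented for illustration): emojis are written as short names.
messages = [["grin", "ball", "trophy"], ["ball", "trophy"], ["grin", "heart"]]
category_of = {
    "grin": "smileys",
    "heart": "smileys",
    "ball": "activities",
    "trophy": "activities",
}

edges = cooccurrence_edges(messages)

# Hypothetical communities, assumed to come from some detection algorithm.
uniform_community = ["grin", "heart"]   # one category only
mixed_community = ["grin", "ball"]      # two categories, evenly split

print(edges[("ball", "trophy")])
print(category_entropy(uniform_community, category_of))
print(category_entropy(mixed_community, category_of))
```

Under this reading, a community whose members all share one category has zero entropy, while a community that mixes categories evenly has the highest entropy for that number of categories.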



Diego Pinheiro and Josemar Faustino would like to thank the Science Without Borders program (CAPES, Brazil) for financial support under grants 0624/14-4 and 1043-14-5, respectively. This material is based upon work supported by the National Science Foundation under Grant No. CNS 09-23050.



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Ryan Hartman (1)
  • S. M. Mahdi Seyednezhad (1)
  • Diego Pinheiro (2)
  • Josemar Faustino (1)
  • Ronaldo Menezes (3)
  1. Department of Computer Engineering and Sciences, Florida Institute of Technology, Melbourne, USA
  2. Department of Internal Medicine, University of California, Davis, USA
  3. BioComplex Laboratory, Department of Computer Science, University of Exeter, Exeter, UK
