Who Tweets in Italian? Demographic Characteristics of Twitter Users

  • Conference paper
New Statistical Developments in Data Science (SIS 2017)

Part of the book series: Springer Proceedings in Mathematics & Statistics ((PROMS,volume 288))

Abstract

In this paper we try, for the first time, to shed light on the use of Twitter by Italian-speaking users, quantifying the total audience and some relevant characteristics, in particular gender and location. The attempt is based on publicly available API data referring to both profile documents and tweets. Through real-time computation it is possible to infer gender, mainly using the name field of the users’ profiles, while geo-location is deduced from the location field and from geotagged tweets.
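As a rough illustration of the kind of profile-field matching the abstract describes, the sketch below looks up tokens of the name field in a first-name lexicon and tokens of the location field in a list of municipalities. It is only a minimal sketch under stated assumptions: the lexicons, the function names and the token-matching rule are hypothetical and are not taken from the paper.

```python
import re

# Hypothetical toy lexicons (assumptions for illustration only);
# a real analysis would rely on complete name and municipality lists.
MALE_NAMES = {"marco", "luca", "alessandro"}
FEMALE_NAMES = {"giulia", "alessandra", "francesca"}
MUNICIPALITIES = {"roma", "milano", "napoli"}


def tokens(text):
    """Split a free-text profile field into lowercase alphabetic tokens."""
    return re.findall(r"[^\W\d_]+", text.lower())


def infer_gender(name_field):
    """Return 'male', 'female' or 'unknown' from the first token found in a lexicon."""
    for tok in tokens(name_field):
        if tok in MALE_NAMES:
            return "male"
        if tok in FEMALE_NAMES:
            return "female"
    return "unknown"


def infer_municipality(location_field):
    """Return the first single-word municipality found in the location field, or None."""
    for tok in tokens(location_field):
        if tok in MUNICIPALITIES:
            return tok
    return None
```

Multi-word municipality names and ambiguous first names are deliberately ignored here; handling them would require a richer matching rule than this sketch assumes.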

Notes

  1. According to Alexa.

  2. National demographic estimate, January 2016.

  3. For further information see [22] and the references therein.

  4. https://dev.twitter.com/overview/api/users

  5. According to Wikipedia, there are 64 million native Italian speakers in the EU and 85 million in the world, whereas Italy has 61 million inhabitants. As for English, there are 360–400 million native speakers and 600–700 million people who speak English as a second language.

  6. https://dev.twitter.com/overview/terms/policy.html

  7. The enterprises whose websites were scraped in the cited study were the majority (64%) of the enterprises (with 10 or more employees) that have a website, but only half of these enterprises presented links to social media.

  8. For example, consider a company named “rossi” and the username “alexRossi”. The username contains the company name, but the remaining letters can be interpreted as a male proper name, and hence the username is not labelled as a company.

  9. That is, the Italian National Institute of Statistics list of municipalities, containing 7978 Italian municipalities.

  10. We also tried to determine the users’ profession using the bio field, through a list of roughly 1000 professions. The results were not at all satisfactory, perhaps because the bio field is an open field that each user interprets in her own way.
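The rule illustrated in note 8 can be sketched as follows. This is a hedged reconstruction from the single “rossi” / “alexRossi” example, not the authors' implementation: the first-name list, the function name and the stripping of separators and digits are all assumptions made for illustration.

```python
# Hypothetical first-name list (assumption); the paper's actual list is not shown.
FIRST_NAMES = {"alex", "maria", "luca"}


def looks_like_company(username, company_name):
    """Label a username as a company only if it contains the company name
    and the leftover characters do not form a known first name."""
    user = username.lower()
    company = company_name.lower()
    if company not in user:
        return False
    # Remove the first occurrence of the company name, then drop separators and digits
    # from both ends of what is left.
    remainder = user.replace(company, "", 1).strip("_-.0123456789")
    return remainder not in FIRST_NAMES
```

With these assumptions, `looks_like_company("alexRossi", "rossi")` is rejected because the remainder `alex` is a first name, while a username such as `rossi_spa` is kept as a candidate company account.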

References

  1. Barcaroli, G., Bianchi, G., Nurra, A.: Internet as a data source: ICT use of enterprises: web ordering, job advertising and presence on social media. In: Big Data Committee Annual Report 2017, ISTAT, CIKM ’10. https://www.istat.it/it/files//2018/09/Big-data-committee.pdf (2018)

  2. Bollen, J., Mao, H., Zeng, X.: Twitter mood predicts the stock market. http://arxiv.org/abs/1010.3003 (2010)

  3. Burger, J.D., Henderson, J., Kim, G., Zarrella, G.: Discriminating gender on twitter. In: Proceedings of the Conference on Empirical Methods in Natural Language Processing, EMNLP ’11, pp. 1301–1309, Stroudsburg, PA, USA. Association for Computational Linguistics. ISBN 978-1-937284-11-4. http://dl.acm.org/citation.cfm?id=2145432.2145568 (2011)

  4. Censis. 13° rapporto censis-ucsi sulla comunicazione i media tra élite e popolo. http://www.censis.it/17?shadow_pubblicazione=120570 (2016)

  5. Chang, J., Rosenn, I., Backstrom, L., Marlow, C.: ePluribus: Ethnicity on social networks. In: ICWSM (2010)

  6. Cheng, Z., Caverlee, J., Lee, K.: You are where you tweet: a content-based approach to geo-locating twitter users. In: Proceedings of the 19th ACM International Conference on Information and Knowledge Management, CIKM ’10, New York, NY, USA, pp. 759–768. ACM. ISBN 978-1-4503-0099-5. https://doi.org/10.1145/1871437.1871535 (2010)

  7. Chu, Z., Gianvecchio, S., Wang, H., Jajodia, S.: Who is tweeting on twitter: human, bot, or cyborg? In: Proceedings of the 26th Annual Computer Security Applications Conference, ACSAC ’10, New York, NY, USA, pp. 21–30. ACM. ISBN 978-1-4503-0133-6. https://doi.org/10.1145/1920261.1920265 (2010)

  8. Culotta, A., Ravi, N.K., Cutler, J.: Predicting the demographics of twitter users from website traffic data. In: Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence, AAAI’15, pp. 72–78. AAAI Press. ISBN 0-262-51129-0. http://dl.acm.org/citation.cfm?id=2887007.2887018 (2015)

  9. Daas, P.J., Burger, J., Le, Q., ten Bosch, O., Puts, M.J.: Profiling of Twitter Users: A Big Data Selectivity Study (2016)

  10. Della Ratta, F., Pontecorvo, M.E., Vaccari, C., Virgillito, A.: Big data and textual analysis: a corpus selection from twitter. Rome between the fear of terrorism and the jubilee. https://www.researchgate.net/publication/303843023_Big_data_and_textual_analysis_a_corpus_selection_from_Twitter_Rome_between_the_fear_of_terrorism_and_the_Jubilee (2016)

  11. Gurajala, S., White, J.S., Hudson, B., Matthews, J.N.: Fake twitter accounts: profile characteristics obtained using an activity-based pattern detection approach. In: Proceedings of the 2015 International Conference on Social Media & Society, SMSociety ’15, New York, NY, USA, pp. 9:1–9:7. ACM. ISBN 978-1-4503-3923-0. https://doi.org/10.1145/2789187.2789206 (2015)

  12. Huang, W., Weber, I., Vieweg, S.: Inferring nationalities of twitter users and studying inter-national linking. In: Proceedings of the 25th ACM Conference on Hypertext and Social Media, HT ’14, New York, NY, USA, pp. 237–242. ACM. ISBN 978-1-4503-2954-5. https://doi.org/10.1145/2631775.2631825 (2014)

  13. ICTGlobus. Social media in italia: analisi dei flussi di utilizzo del 2016. https://www.ictglobus.com/social-media-in-italia-analisi-dei-flussi-di-utilizzo-del-2016/ (2017)

  14. Ikeda, K., Hattori, G., Matsumoto, K., Ono, C., Higashino, T.: Demographic estimation of twitter users for marketing analysis. IPSJ Trans. Consum. Devices Syst. 2(1), 82–93 (2012)

  15. Ikeda, K., Hattori, G., Ono, C., Asoh, H., Higashino, T.: Twitter user profiling based on text and community mining for market analysis. Knowl.-Based Syst. 51(1), 35–47. ISSN 0950-7051. https://doi.org/10.1016/j.knosys.2013.06.020 (2013)

  16. Ito, J., Nishida, K., Hoshide, T., Toda, H., Uchiyama, T.: Demographic and psychographic estimation of twitter users using social structures, pp. 27–46. Springer International Publishing, Cham (2014). ISBN 978-3-319-13590-8. https://doi.org/10.1007/978-3-319-13590-8_2

  17. Lee, K., Caverlee, J., Webb, S.: Uncovering social spammers: Social honeypots + machine learning. In: Proceedings of the 33rd International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR ’10, pp. 435–442, New York, NY, USA. ACM (2010). ISBN 978-1-4503-0153-4. https://doi.org/10.1145/1835449.1835522

  18. Liu, W., Ruths, D.: What’s in a name? Using first names as features for gender inference in twitter. In: AAAI spring symposium: Analyzing microtext, vol. 13, p. 01 (2013)

  19. Mislove, A., Jørgensen, S., Ahn, Y.-Y., Onnela, J.-P., Rosenquist, J.: Understanding the demographics of twitter users, pp. 554–557. AAAI Press (2011). ISBN 978-1-57735-505-2

  20. Mohammady, E., Culotta, A.: Using county demographics to infer attributes of twitter users. ACL 2014, 7 (2014)

  21. Nguyen, D., Smith, N.A., Rosé, C.P.: Author age prediction from text using linear regression. In: Proceedings of the 5th ACL-HLT Workshop on Language Technology for Cultural Heritage, Social Sciences, and Humanities, LaTeCH ’11, pp. 115–123, Stroudsburg, PA, USA, 2011. Association for Computational Linguistics. ISBN 9781937284046. http://dl.acm.org/citation.cfm?id=2107636.2107651

  22. Paquet-Clouston, M., Bilodeau, O., Décary-Hétu, D.: Can we trust social media data?: Social network manipulation by an IoT botnet. In: Proceedings of the 8th International Conference on Social Media & Society, #SMSociety17, pp. 15:1–15:9, New York, NY, USA. ACM (2017). ISBN 978-1-4503-4847-8. https://doi.org/10.1145/3097286.3097301

  23. Pennacchiotti, M., Popescu, A.-M.: A machine learning approach to twitter user classification. In: ICWSM (2011)

  24. Preotiuc-Pietro, D., Volkova, S., Lampos, V., Bachrach, Y., Aletras, N.: Studying user income through language, behaviour and affect in social media. PLOS One 10(9), 1–17 (2015). https://doi.org/10.1371/journal.pone.0138717

  25. Rao, D., Yarowsky, D., Shreevats, A., Gupta, M.: Classifying latent user attributes in twitter. In: Proceedings of the 2nd International Workshop on Search and Mining User-generated Contents, SMUC ’10, pp. 37–44, New York, NY, USA. ACM. ISBN 978-1-4503-0386-6. https://doi.org/10.1145/1871985.1871993 (2010)

  26. Rao, D., Paul, M.J., Fink, C., Yarowsky, D., Oates, T., Coppersmith, G.: Hierarchical bayesian models for latent attribute detection in social media. In: Adamic, L.A., Baeza-Yates, R.A., Counts, S. (eds.) ICWSM. The AAAI Press. http://dblp.uni-trier.de/db/conf/icwsm/icwsm2011.html#RaoPFYOC11 (2011)

  27. Sakaki, S., Miura, Y., Ma, X., Hattori, K., Ohkuma, T.: Twitter user gender inference using combined analysis of text and image processing. V&L Net 2014, 54 (2014)

  28. Schwartz, H.A., Eichstaedt, J.C., Kern, M.L., Dziurzynski, L., Lucas, R.E., Agrawal, M., Park, G.J., Lakshmikanth, S.K., Jha, S., Seligman, M.E. et al.: Characterizing geographic variation in well-being using tweets. In: ICWSM (2013)

  29. Sloan, L.: Who tweets in the united kingdom? Profiling the twitter population using the british social attitudes survey 2015. Social Media + Society, 3(1), 2056305117698981 (2017). https://doi.org/10.1177/2056305117698981

  30. Sloan, L., Morgan, J.: Who tweets with their location? Understanding the relationship between demographic characteristics and the use of geoservices and geotagging on twitter. PLOS One 10(11), 1–15 (2015). https://doi.org/10.1371/journal.pone.0142209

  31. Sloan, L., Morgan, J., Burnap, P., Williams, M.: Who tweets? Deriving the demographic characteristics of age, occupation and social class from twitter user meta-data. PLOS One 10(3), 1–20 (2015). https://doi.org/10.1371/journal.pone.0115545

  32. Varol, O., Ferrara, E., Davis, C.A., Menczer, F., Flammini, A.: Online human-bot interactions: detection, estimation, and characterization. CoRR abs/1703.03107, http://arxiv.org/abs/1703.03107 (2017)

  33. Zamal, F.A., Liu, W., Ruths, D.: Homophily and latent attribute inference: inferring latent attributes of twitter users from neighbors. In: Breslin, J.G., Ellison, N.B., Shanahan, J.G., Tufekci, Z. (eds.) ICWSM. The AAAI Press. http://dblp.uni-trier.de/db/conf/icwsm/icwsm2012.html#ZamalLR12 (2012)

Author information

Corresponding author

Correspondence to Righi Alessandra.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Alessandra, R., Gentile, M.M., Bianco, D.M. (2019). Who Tweets in Italian? Demographic Characteristics of Twitter Users. In: Petrucci, A., Racioppi, F., Verde, R. (eds) New Statistical Developments in Data Science. SIS 2017. Springer Proceedings in Mathematics & Statistics, vol 288. Springer, Cham. https://doi.org/10.1007/978-3-030-21158-5_25
