Abstract
This chapter outlines a process for using linguistic data collected from Twitter to predict for whom a person will vote. The linguistic analysis builds on previous research into profiling based on word frequencies in natural language. We use data collected from social media to train several machine-learning algorithms to predict a user’s voting preference in the context of the ongoing US presidential election. The study is purely exploratory: we test the feasibility of election prediction based exclusively on natural language usage in tweets, and therefore include no other parameters in the prediction model. We present a methodology that achieves an accuracy above 60% in predicting a user’s voting preference using only the most basic linguistic features, and we discuss possible extensions and shortcomings.
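The abstract does not say which features or algorithms the chapter uses, so the following is only a minimal sketch of the general idea: word frequencies from labelled tweets feeding a simple classifier. It assumes a multinomial naive Bayes model with add-one smoothing, which is an illustrative choice and not necessarily one of the algorithms the authors trained. The tweets, the labels `A` and `B`, and the names `tokenize`, `train`, and `predict` are all hypothetical.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase a tweet and split it into word tokens (a deliberately basic feature)."""
    return re.findall(r"[a-z']+", text.lower())


def train(labeled_tweets):
    """Count word frequencies per label from (text, label) pairs."""
    word_counts = defaultdict(Counter)  # label -> word -> count
    label_counts = Counter()            # label -> number of tweets
    for text, label in labeled_tweets:
        label_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = {w for counts in word_counts.values() for w in counts}
    return word_counts, label_counts, vocab


def predict(model, text):
    """Return the label with the highest smoothed log-probability for the tweet."""
    word_counts, label_counts, vocab = model
    total_tweets = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        # Log prior: share of training tweets carrying this label.
        score = math.log(label_counts[label] / total_tweets)
        # Add-one smoothing denominator: words seen for this label plus vocabulary size.
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in tokenize(text):
            if word in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Invented toy data; "A" and "B" are placeholder labels, not real candidates.
TOY_TWEETS = [
    ("lower taxes and secure borders", "A"),
    ("secure the borders now", "A"),
    ("healthcare for all families", "B"),
    ("expand healthcare and education", "B"),
]

model = train(TOY_TWEETS)
print(predict(model, "we need secure borders"))
print(predict(model, "healthcare and education"))
```

A real experiment would need far more data, a held-out test set to measure accuracy, and some care about how voting-preference labels are obtained; none of that is specified in the abstract.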
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Bachhuber, J., Koppeel, C., Morina, J., Rejström, K., Steinschulte, D. (2016). US Election Prediction: A Linguistic Analysis of US Twitter Users. In: Zylka, M., Fuehres, H., Fronzetti Colladon, A., Gloor, P. (eds) Designing Networks for Innovation and Improvisation. Springer Proceedings in Complexity. Springer, Cham. https://doi.org/10.1007/978-3-319-42697-6_6
DOI: https://doi.org/10.1007/978-3-319-42697-6_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-42696-9
Online ISBN: 978-3-319-42697-6
eBook Packages: Computer Science (R0)