Abstract
Online social networks are a popular communication tool for internet users. Millions of users share opinions on different aspects of everyday life, which makes microblogging websites rich sources of data for opinion mining and sentiment analysis. Our current research, which analyses migration using various social networks, required a tool for automated multilingual sentiment analysis covering as many languages as possible. Most available tools work only with texts written in English, the most common language on social media. A few open source tools can process French, German and Spanish texts, but reimplementing and joining different approaches is not optimal. Another requirement is the ability to process both dynamic data streams and static historical datasets with high efficiency. Lower accuracy and completeness of the evaluated messages is an acceptable trade-off for these general requirements. The paper presents a sample data collection from Twitter for opinion mining purposes. We perform multilingual sentiment analysis of the collected data and briefly explain the experimental results. The analysis uses a custom-built solution based on AFINN-165, a manually evaluated dictionary of English words. This dictionary was translated into other languages using the Google Translate API, which was tested during the process. It is then possible to determine positive, negative and neutral sentiment. The results bring new insights, offer a possibility for wider use, and allow optimisation of the wordlists and the tool, which should lead to better results in future research. Geospatial analysis of the first experimental results uncovers an interesting relation between time, location and sentiment, which suggests various use cases.
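The dictionary-based approach described in the abstract can be illustrated with a minimal sketch. This is not the author's implementation: the tiny word lists, the valence values, the language codes, and the names `WORDLISTS`, `tokenize` and `score_message` are all assumptions made for illustration. The sketch only shows the general idea of summing per-word valences from a per-language word list and mapping the sum to a positive, negative or neutral label.

```python
import re

# Hypothetical miniature word lists with illustrative valences.
# A real AFINN-style list holds thousands of entries scored from -5 to +5,
# and the non-English lists would come from machine translation.
WORDLISTS = {
    "en": {"good": 3, "happy": 3, "bad": -3, "terrible": -3},
    "es": {"bueno": 3, "feliz": 3, "malo": -3, "terrible": -3},
}


def tokenize(text):
    """Lowercase the text and return runs of letters (Unicode-aware)."""
    return re.findall(r"[^\W\d_]+", text.lower())


def score_message(text, lang):
    """Sum the valences of known words and derive a sentiment label.

    Words missing from the word list contribute 0, so messages in an
    unsupported language or without known words come out as neutral.
    """
    wordlist = WORDLISTS.get(lang, {})
    score = sum(wordlist.get(token, 0) for token in tokenize(text))
    if score > 0:
        label = "positive"
    elif score < 0:
        label = "negative"
    else:
        label = "neutral"
    return score, label


if __name__ == "__main__":
    print(score_message("A good and happy day", "en"))
    print(score_message("Un día malo", "es"))
```

Because unknown words are simply skipped, a scorer of this kind is fast enough for streams and historical datasets alike, at the cost of the reduced accuracy and completeness that the abstract accepts as a trade-off.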
References
Biever, C. (2010). Twitter mood maps reveal emotional states of America. New Scientist, 207, 14. doi:10.1016/S0262-4079(10)61833-7
Bollen, J., Mao, H., & Pepe, A. (2011). Modeling public mood and emotion: Twitter sentiment and socio-economic phenomena. ICWSM, 11, 450–453.
Duh, K., Fujino, A., & Nagata, M. (2011). Is machine translation ripe for cross-lingual sentiment classification? In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, HLT ’11, Short Papers (Vol. 2, pp. 429–433). Stroudsburg, PA, USA: Association for Computational Linguistics.
Google. (2016). Google translate API—Fast dynamic localization | google cloud platform [WWW Document]. Google Dev. URL:https://cloud.google.com/translate/. Accessed 6.20.16.
Hauthal, E., & Burghardt, D. (2015). Temporal occurrence and time-dependency of georeferenced emotions extracted from user-generated content. Presented at the 18th AGILE International Conference on Geographic Information Science, Lisbon.
Hauthal, E., & Burghardt, D. (2016). Mapping space-related emotions out of user-generated photo metadata considering grammatical issues. The Cartographic Journal, 53, 78–90. doi:10.1179/1743277414Y.0000000094
Horák, J., Belaj, P., Ivan, I., Nemec, P., Ardielli, J., & Růžička, J. (2011). Geoparsing of Czech RSS news and evaluation of its spatial distribution. In R. Katarzyniak, T.-F. Chiu, C.-F. Hong, & N. T. Nguyen (Eds.), Semantic methods for knowledge management and communication, studies in computational intelligence (pp. 353–367). Berlin, Heidelberg: Springer.
Ivan, I., Kocich, D., & Horák, J. (2016). Identification of crime environmental factors based on spatial human data integration. In SGEM Conference Proceedings, Presented at the SGEM 2016: 16th International Multidisciplinary Scientific Geoconference (Book2 Vol. 1, pp. 697–704). Albena, Bulgaria. doi:10.5593/SGEM2016/B21/S08.087
Kitchin, R. (2014). The real-time city? Big data and smart urbanism. GeoJournal, 79, 1–14. doi:10.1007/s10708-013-9516-8
Kocich, D. (2017). Afinn-165-multilingual [online]. Available from: https://github.com/dkocich/afinn-165-multilingual
Kocich, D., & Horák, J. (2016). Twitter as a source of big spatial data. In SGEM Conference Proceedings, Presented at the SGEM 2016: 16th International Multidisciplinary Scientific Geoconference (Book2 Vol. 1, pp. 921–928). Albena, Bulgaria. doi:10.5593/SGEM2016/B21/S08.116
Koehn, P., Och, F.J., & Marcu, D. (2003). Statistical phrase-based translation. In Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology—Volume 1, NAACL ’03 (pp. 48–54). Stroudsburg, PA, USA: Association for Computational Linguistics. doi:10.3115/1073445.1073462
Kotzias, D., Denil, M., de Freitas, N., & Smyth, P. (2015). From group to individual labels using deep features. In Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD ’15 (pp. 597–606). New York, NY, USA: ACM. doi:10.1145/2783258.2783380
Lampos, V., Bie, T. D., & Cristianini, N. (2010). Flu detector - Tracking epidemics on Twitter. In J. L. Balcázar, F. Bonchi, A. Gionis, & M. Sebag (Eds.), Machine learning and knowledge discovery in databases, Lecture Notes in Computer Science (pp. 599–602). Berlin, Heidelberg: Springer.
Letsch, C. (2014). Turkey Twitter users flout Erdogan ban on micro-blogging site. The Guardian, 21.
Mislove, A., Lehmann, S., Ahn, Y.-Y., Onnela, J.-P., & Rosenquist, J.N. (2010). Pulse of the nation: U.S. mood throughout the day inferred from Twitter [WWW Document]. URL:http://www.ccs.neu.edu/home/amislove/twittermood/. Accessed 7.1.16.
Nguyen, V. H., Nguyen, H. T., & Snasel, V. (2015). Normalization of Vietnamese tweets on Twitter. In A. Abraham, X. H. Jiang, V. Snášel, & J.-S. Pan (Eds.), Intelligent data analysis and applications, Advances in intelligent systems and computing (pp. 179–189). Berlin: Springer International Publishing.
Nielsen, F.Å. (2011). A new ANEW: Evaluation of a word list for sentiment analysis in microblogs. arXiv:1103.2903 [cs].
Pak, A., & Paroubek, P. (2010). Twitter as a corpus for sentiment analysis and opinion mining. In LREC (pp. 1320–1326).
Pánek, J., & Benediktsson, K. (2017). Emotional mapping and its participatory potential: Opinions about cycling conditions in Reykjavík, Iceland. Cities, 61, 65–73. doi:10.1016/j.cities.2016.11.005
Refaee, E., & Rieser, V. (2014). An Arabic Twitter corpus for subjectivity and sentiment analysis. In LREC (pp. 2268–2273).
Saravia, E., Argueta, C., & Chen, Y.-S. (2016). Unsupervised graph-based pattern extraction for multilingual emotion classification. Social Network Analysis and Mining, 6, 92. doi:10.1007/s13278-016-0403-4
Sun, S., Luo, C., & Chen, J. (2017). A review of natural language processing techniques for opinion mining systems. Information Fusion, 36, 10–25. doi:10.1016/j.inffus.2016.10.004
Wu, Y., Schuster, M., Chen, Z., Le, Q.V., Norouzi, M., Macherey, W., et al. (2016). Google’s neural machine translation system: Bridging the gap between human and machine translation. arXiv:1609.08144 [cs].
Xiang, G., Fan, B., Wang, L., Hong, J., & Rose, C. (2012). Detecting offensive tweets via topical feature discovery over a large scale twitter corpus. In Proceedings of the 21st ACM International Conference on Information and Knowledge Management, CIKM ’12 (pp. 1980–1984). New York, NY, USA: ACM. doi:10.1145/2396761.2398556
Acknowledgements
The research is supported by the VŠB-Technical University of Ostrava, the Faculty of Mining and Geology, grant project Crowdsourced geodata, No. SP2016/41. Access to computing and storage facilities owned by parties and projects contributing to the National Grid Infrastructure MetaCentrum, provided under the programme “Projects of Large Research, Development, and Innovations Infrastructures” (CESNET LM2015042), is greatly appreciated.
Supplementary Materials
The translated AFINN-165 dictionary and a customized version of the sentiment library are available online on GitHub (dkocich/afinn-165-multilingual, dkocich/sentiment) (Kocich 2017).
Copyright information
© 2018 Springer International Publishing AG
About this paper
Cite this paper
Kocich, D. (2018). Multilingual Sentiment Mapping Using Twitter, Open Source Tools, and Dictionary Based Machine Translation Approach. In: Ivan, I., Horák, J., Inspektor, T. (eds) Dynamics in GIscience. GIS OSTRAVA 2017. Lecture Notes in Geoinformation and Cartography. Springer, Cham. https://doi.org/10.1007/978-3-319-61297-3_16
DOI: https://doi.org/10.1007/978-3-319-61297-3_16
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-61296-6
Online ISBN: 978-3-319-61297-3
eBook Packages: Earth and Environmental Science (R0)