Abstract
Spanish is the third most used language on the internet. However, Natural Language Processing research on Spanish still lags well behind that on other languages such as English. The aim of this paper is to fill this gap in the literature and to provide a comprehensive assessment of Deep Learning applied to Spanish sentiment analysis. We focus on the polarity detection task, which remains challenging in the context of Spanish Twitter messages. To do so, we explore combinations of several word representations (Word2Vec, GloVe, FastText) and Deep Neural Network models. In contrast to the poor performance reported by previous related work using Deep Learning for Spanish sentiment analysis, we show promising results. Our best setting combines three word embedding representations, Convolutional Neural Networks and Recurrent Neural Networks. This setup allows us to obtain state-of-the-art results on the TASS/SEPLN 2017 Spanish Twitter benchmark dataset in terms of accuracy and macro F1-measure.
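As a rough illustration of the two metrics named above, the following minimal sketch computes accuracy and macro-averaged F1 for a polarity classifier. The label names (`P`, `N`, `NEU`, `NONE`), the toy data and the function names are assumptions made for this example only. The sketch averages per-class F1 scores, which is the common definition of macro F1; the official benchmark scorer may compute it differently.

```python
from collections import Counter


def accuracy(gold, pred):
    """Fraction of predictions that match the gold labels."""
    if len(gold) != len(pred) or not gold:
        raise ValueError("gold and pred must be non-empty and of equal length")
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def macro_f1(gold, pred):
    """Unweighted mean of per-class F1 over all labels seen in gold or pred."""
    if len(gold) != len(pred) or not gold:
        raise ValueError("gold and pred must be non-empty and of equal length")
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1          # correct prediction for class g
        else:
            fp[p] += 1          # p was predicted but is wrong
            fn[g] += 1          # g was the true class but was missed
    scores = []
    for label in sorted(set(gold) | set(pred)):
        denom = 2 * tp[label] + fp[label] + fn[label]
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy data, not taken from the benchmark.
gold = ["P", "N", "NEU", "NONE", "P", "N"]
pred = ["P", "N", "P", "NONE", "N", "N"]
print(f"accuracy = {accuracy(gold, pred):.3f}")
print(f"macro F1 = {macro_f1(gold, pred):.3f}")
```

Because macro F1 weights every class equally, a system that never predicts a rare class (here `NEU`) is penalised much more heavily than accuracy alone would suggest.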
Notes
- 1. The following tool was used to perform POS tagging: http://www.cis.uni-muenchen.de/~schmid/tools/TreeTagger/.
- 2.
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Ochoa-Luna, J., Ari, D. (2019). Word Embeddings and Deep Learning for Spanish Twitter Sentiment Analysis. In: Lossio-Ventura, J., Muñante, D., Alatrista-Salas, H. (eds) Information Management and Big Data. SIMBig 2018. Communications in Computer and Information Science, vol 898. Springer, Cham. https://doi.org/10.1007/978-3-030-11680-4_4
DOI: https://doi.org/10.1007/978-3-030-11680-4_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-11679-8
Online ISBN: 978-3-030-11680-4
eBook Packages: Computer Science, Computer Science (R0)