Abstract
The enormous growth of user-generated information on social networks has created a need for new algorithms and methods to classify it. Sentiment Analysis (SA) methods attempt to identify the polarity of a text, using ranking algorithms among other resources. One of the most popular ranking algorithms is Okapi BM25, designed to rank documents according to their relevance to a topic. In this paper, we present an approach to sentiment analysis for Spanish tweets that combines the BM25 ranking function with a linear Support Vector supervised model. We describe the procedure implemented to adapt BM25 to the peculiarities of SA on Twitter. The results confirm the potential of the BM25 algorithm to improve sentiment analysis tasks.
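To make the idea of BM25 as a feature weighting concrete, the following is a minimal sketch of the standard Okapi BM25 term weight computed per tweet. It is not the adaptation described in the paper, whose details are not given in the abstract. The function name `bm25_weights`, the parameter values `k1=1.5` and `b=0.75`, the smoothed non-negative IDF variant, and the toy tweets are all assumptions chosen for illustration; the input is assumed to be a non-empty list of non-empty token lists.

```python
import math
from collections import Counter


def bm25_weights(docs, k1=1.5, b=0.75):
    """Return one {term: weight} dict per tokenized document.

    Uses the standard Okapi BM25 term weight. k1 and b are common
    default values, assumed here rather than taken from the paper.
    """
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents that contain each term.
    df = Counter(term for d in docs for term in set(d))

    weighted = []
    for d in docs:
        tf = Counter(d)
        # Length normalisation: longer-than-average documents are penalised.
        norm = k1 * (1 - b + b * len(d) / avg_len)
        weights = {}
        for term, f in tf.items():
            # Smoothed IDF that never goes negative (an assumed variant).
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            weights[term] = idf * f * (k1 + 1) / (f + norm)
        weighted.append(weights)
    return weighted


# Hypothetical toy corpus of whitespace-tokenized Spanish tweets.
tweets = [
    "me encanta este movil".split(),
    "odio este movil".split(),
    "que gran dia".split(),
]
weights = bm25_weights(tweets)
```

Each resulting dictionary can be turned into a sparse feature vector and passed to a linear support vector classifier (for example scikit-learn's `LinearSVC`, an assumed choice of library) in place of raw term counts. Terms that occur in fewer tweets, such as "encanta" above, receive a higher weight than terms shared across tweets, such as "este".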
Notes
- 1. Workshop on Sentiment Analysis at SEPLN Conference.
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Sixto, J., Almeida, A., López-de-Ipiña, D. (2016). Improving the Sentiment Analysis Process of Spanish Tweets with BM25. In: Métais, E., Meziane, F., Saraee, M., Sugumaran, V., Vadera, S. (eds) Natural Language Processing and Information Systems. NLDB 2016. Lecture Notes in Computer Science, vol 9612. Springer, Cham. https://doi.org/10.1007/978-3-319-41754-7_26
DOI: https://doi.org/10.1007/978-3-319-41754-7_26
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-41753-0
Online ISBN: 978-3-319-41754-7
eBook Packages: Computer Science (R0)