Sentiment Classification in Multiple Languages: Fifty Shades of Customer Opinions

Kincl, Tomáš; Novák, Michal; Přibil, Jiří

doi:10.1007/978-3-319-22593-7_19

Conference paper

Part of the book series: Eurasian Studies in Business and Economics (EBES, volume 2/2)

Abstract

Sentiment analysis is a natural language processing task whose goal is to classify the polarity of expressed opinions. Aiming for the highest classification accuracy in one particular language, however, does not truly reflect the needs of business. Sentiment analysis is often used by multinational companies operating in multiple markets. Such companies are interested in consumer opinions about their products and services in different countries (and thus in different languages). Yet most research on multi-language sentiment classification simply translates text automatically from minor languages into English and then conducts sentiment analysis in English. This paper contributes to the multi-language sentiment classification problem by proposing a language-independent approach that could provide a good level of classification accuracy in multiple languages without automated translation or language-dependent components (such as lexicons). The results indicate that the proposed approach could provide a high level of sentiment classification accuracy, even for multiple languages and without language-dependent components.
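
The abstract does not spell out which features or classifier the authors use, so the following is only a minimal sketch of what a language-independent pipeline of this kind might look like. It assumes character trigram counts as features and a small multinomial Naive Bayes classifier. Both choices, the helper names (`char_ngrams`, `CharNgramNaiveBayes`) and the six toy reviews are illustrative assumptions, not the paper's method or data. The point it shows is that a single model can be trained on English, German and Czech text at once with no tokenizer, lexicon or translation step.

```python
import math
from collections import Counter, defaultdict


def char_ngrams(text, n=3):
    """Return overlapping character n-grams; no tokenizer or lexicon is needed."""
    padded = f" {text.lower()} "
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


class CharNgramNaiveBayes:
    """Multinomial Naive Bayes over character n-gram counts (illustrative assumption)."""

    def __init__(self, n=3):
        self.n = n
        self.ngram_counts = defaultdict(Counter)  # label -> n-gram frequencies
        self.doc_counts = Counter()               # label -> number of training texts
        self.vocab = set()                        # all n-grams seen in training

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            grams = char_ngrams(text, self.n)
            self.ngram_counts[label].update(grams)
            self.doc_counts[label] += 1
            self.vocab.update(grams)
        return self

    def predict(self, text):
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocab)
        best_label, best_score = None, -math.inf
        for label, counts in self.ngram_counts.items():
            total = sum(counts.values())
            # Log prior of the class plus smoothed log likelihood of each n-gram.
            score = math.log(self.doc_counts[label] / total_docs)
            for gram in char_ngrams(text, self.n):
                # Add-one (Laplace) smoothing keeps unseen n-grams from zeroing the score.
                score += math.log((counts[gram] + 1) / (total + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy reviews in English, German and Czech, mixed in one training set.
train_texts = [
    "great product, works perfectly",
    "terrible quality, very disappointed",
    "tolles Produkt, funktioniert perfekt",
    "schlechte Qualität, sehr enttäuscht",
    "skvělý výrobek, funguje perfektně",
    "špatná kvalita, velké zklamání",
]
train_labels = ["pos", "neg", "pos", "neg", "pos", "neg"]

model = CharNgramNaiveBayes(n=3).fit(train_texts, train_labels)

for review in ["perfekt", "sehr schlecht"]:
    print(review, "->", model.predict(review))
```

Character n-grams are used here because they need no language-specific preprocessing and tolerate inflection and spelling variation. A real evaluation would need far more data and a held-out test set per language.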

Author information

Authors and Affiliations

Faculty of Management, Department of Exact Methods, University of Economics, Prague, Czech Republic
Tomáš Kincl, Michal Novák & Jiří Přibil

Corresponding author

Correspondence to Tomáš Kincl.

Editor information

Editors and Affiliations

Istanbul Medeniyet University, Faculty of Economics, Istanbul, Turkey
Mehmet Huseyin Bilgin
MUFG Union Bank, San Francisco, California, USA
Hakan Danis
Istanbul Medeniyet University, Faculty of Tourism, Istanbul, Turkey
Ender Demir
Eurasia Business & Economic Society, Fatih Istanbul, Turkey
Ugur Can

Cite this paper

Kincl, T., Novák, M., Přibil, J. (2016). Sentiment Classification in Multiple Languages: Fifty Shades of Customer Opinions. In: Bilgin, M., Danis, H., Demir, E., Can, U. (eds) Business Challenges in the Changing Economic Landscape - Vol. 2. Eurasian Studies in Business and Economics, vol 2/2. Springer, Cham. https://doi.org/10.1007/978-3-319-22593-7_19

DOI: https://doi.org/10.1007/978-3-319-22593-7_19
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-22592-0
Online ISBN: 978-3-319-22593-7
eBook Packages: Economics and Finance, Economics and Finance (R0)
