Domain-Tailored Multiclass Classification of User Reviews Based on Binary Splits

  • Alexandre Lunardi
  • José Viterbo (corresponding author)
  • Clodis Boscarioli
  • Flavia Bernardini
  • Cristiano Maciel
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9742)


Machine learning algorithms can perform sentiment analysis, automatically identifying the sentiment expressed in online reviews of products or services. In many practical sentiment analysis scenarios, reviews must be classified into ratings from 1 to 5 stars, which is a multiclass problem. In the literature, we found that the best results for review classification come from approaches based on binary splits, which achieve accuracies above 90%. We therefore propose a model, based on the Nested Dichotomies algorithm, that performs multiclass classification as a sequence of binary classification steps. To make this classifier more effective, we propose that the first split be defined by identifying the users' recommendation threshold. We present a case study in which this classification model is applied to a set of subjective data extracted from TripAdvisor, discuss the process of determining the first split, and evaluate the accuracy of the proposed model.
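As a rough illustration of the idea described above, the following Python sketch builds a nested dichotomy over ordered star ratings and combines the binary decisions into class probabilities by multiplying the branch probabilities along each path. It is not the authors' implementation: the names `Node`, `build_tree`, `class_probabilities`, the `first_split` parameter, the threshold value of 3 stars, and the midpoint rule for the remaining splits are all assumptions made for this example, and `p_right` stands in for whatever binary classifier would be trained at each node.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Sequence, Tuple


@dataclass
class Node:
    classes: Tuple[int, ...]         # star ratings covered by this node
    left: Optional["Node"] = None    # branch holding the lower ratings
    right: Optional["Node"] = None   # branch holding the higher ratings


def build_tree(classes: Sequence[int], first_split: Optional[int] = None) -> Node:
    """Build a nested dichotomy over ordered classes.

    first_split (hypothetical parameter): highest rating placed in the
    lower branch at the root. All later splits cut at the midpoint,
    which is an assumption of this sketch.
    """
    classes = tuple(classes)
    node = Node(classes)
    if len(classes) == 1:
        return node  # leaf: a single star rating
    if first_split is not None:
        cut = sum(1 for c in classes if c <= first_split)
    else:
        cut = len(classes) // 2
    if not 0 < cut < len(classes):
        raise ValueError("first_split must separate the classes into two groups")
    node.left = build_tree(classes[:cut])
    node.right = build_tree(classes[cut:])
    return node


def class_probabilities(
    node: Node,
    p_right: Callable[[Node, str], float],
    review: str,
    prob: float = 1.0,
) -> Dict[int, float]:
    """Multiply binary branch probabilities along each root-to-leaf path.

    p_right(node, review) returns the probability that the review belongs
    to the higher-rating branch of the given node.
    """
    if node.left is None or node.right is None:
        return {node.classes[0]: prob}
    p = p_right(node, review)
    out = class_probabilities(node.left, p_right, review, prob * (1.0 - p))
    out.update(class_probabilities(node.right, p_right, review, prob * p))
    return out


# Usage: the root separates ratings 1-3 from ratings 4-5 (assumed threshold).
tree = build_tree([1, 2, 3, 4, 5], first_split=3)
# Placeholder binary model that is undecided at every node.
probs = class_probabilities(tree, lambda node, review: 0.5, "great hotel")
print(probs)
```

In a real system each internal node would hold its own trained binary classifier, and the predicted rating would be the class with the highest combined probability.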


Keywords: Sentiment analysis; Recommender systems; Human centered design



Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  • Alexandre Lunardi (1)
  • José Viterbo (1, corresponding author)
  • Clodis Boscarioli (2)
  • Flavia Bernardini (1)
  • Cristiano Maciel (3)

  1. Fluminense Federal University (UFF), Niterói, Brazil
  2. Western Paraná State University (UNIOESTE), Cascavel, Brazil
  3. Federal University of Mato Grosso (UFMT), Cuiabá, Brazil
