
Journal of Computer Science and Technology, Volume 33, Issue 6, pp 1307–1319

A New Method for Sentiment Analysis Using Contextual Auto-Encoders

  • Hanen Ameur
  • Salma Jamoussi
  • Abdelmajid Ben Hamadou
Regular Paper

Abstract

Sentiment analysis, a hot research topic, presents new challenges for understanding users’ opinions and judgments expressed online. It aims to classify subjective texts by assigning them a polarity label. In this paper, we introduce a novel machine learning framework that uses an auto-encoder network to predict the sentiment polarity label at the word level and the sentence level. Inspired by the dimensionality reduction and feature extraction capabilities of auto-encoders, we propose a new model for distributed word vector representation, “PMI-SA”, which takes pointwise mutual information (PMI) word vectors as input. The resulting continuous word vectors are combined to represent a sentence. An unsupervised sentence embedding method, called Contextual Recursive Auto-Encoders (CoRAE), is also developed for learning sentence representations. CoRAE follows the basic idea of recursive auto-encoders, deeply composing the vectors of the words that constitute the sentence, but without relying on any syntactic parse tree. The CoRAE model recursively combines each word with its context words (its neighbors: the previous and next words) while taking word order into account. A support vector machine classifier with a fine-tuning technique is also used to show that our deep compositional representation model CoRAE significantly improves the accuracy of the sentiment analysis task. Experimental results demonstrate that CoRAE remarkably outperforms several competitive baseline methods on two datasets, namely the Sanders Twitter corpus and a Facebook comments corpus. The CoRAE model achieves an efficiency of 83.28% on the Facebook dataset and 97.57% on the Sanders dataset.
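
For readers unfamiliar with the PMI word vectors that the abstract names as the input to PMI-SA, the following is a minimal sketch of how such vectors could be built using the standard definition PMI(w, c) = log2(p(w, c) / (p(w) p(c))). It is not the authors' implementation: the function name `pmi_vectors`, the use of whole sentences as the co-occurrence window, and the choice to assign 0 to pairs that never co-occur are all assumptions made for illustration.

```python
import math
from collections import Counter
from itertools import combinations


def pmi_vectors(sentences):
    """Build one PMI vector per word from tokenized sentences.

    Assumptions (not taken from the paper): two words co-occur when they
    appear in the same sentence, and unseen pairs get a PMI of 0.0.
    """
    n = len(sentences)
    word_count = Counter()   # number of sentences containing each word
    pair_count = Counter()   # number of sentences containing each word pair
    for tokens in sentences:
        unique = sorted(set(tokens))
        word_count.update(unique)
        pair_count.update(combinations(unique, 2))  # pairs stored in sorted order

    vocab = sorted(word_count)
    vectors = {}
    for w in vocab:
        row = []
        for c in vocab:
            joint = 0 if w == c else pair_count[(w, c) if w < c else (c, w)]
            if joint == 0:
                row.append(0.0)
            else:
                # p(w,c) / (p(w) p(c)) = joint * n / (count(w) * count(c))
                row.append(math.log2(joint * n / (word_count[w] * word_count[c])))
        vectors[w] = row
    return vocab, vectors
```

Each row of the result has one dimension per vocabulary word, so the vectors grow with the vocabulary; this is the kind of high-dimensional input that a dimensionality-reducing auto-encoder would then compress.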

Keywords

sentiment analysis; recursive auto-encoder; stacked auto-encoder; pointwise mutual information; deep embedding representation

Supplementary material

ESM 1: 11390_2018_1889_MOESM1_ESM.pdf (PDF, 420 kb)

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  • Hanen Ameur (1), corresponding author
  • Salma Jamoussi (1)
  • Abdelmajid Ben Hamadou (1)
  1. Multimedia Information Systems and Advanced Computing Laboratory, Sfax University, Sfax Technopole, Sfax, Tunisia
