A New Method for Sentiment Analysis Using Contextual Auto-Encoders
Sentiment analysis, a hot research topic, presents new challenges for understanding users’ opinions and judgments expressed online. It aims to classify subjective texts by assigning them a polarity label. In this paper, we introduce a novel machine learning framework that uses a network of auto-encoders to predict the sentiment polarity label at the word level and the sentence level. Inspired by the dimensionality reduction and feature extraction capabilities of auto-encoders, we propose a new model for distributed word vector representation, “PMI-SA”, which takes pointwise mutual information (PMI) word vectors as input. The resulting continuous word vectors are combined to represent a sentence. We also develop an unsupervised sentence embedding method, called Contextual Recursive Auto-Encoders (CoRAE), for learning sentence representations. CoRAE follows the basic idea of recursive auto-encoders, deeply composing the vectors of the words that constitute the sentence, but without relying on any syntactic parse tree. The CoRAE model recursively combines each word with its context words (its neighbors: the previous and the next word) while taking word order into account. A support vector machine classifier with a fine-tuning technique is also used to show that our deep compositional representation model CoRAE significantly improves the accuracy of the sentiment analysis task. Experimental results demonstrate that CoRAE remarkably outperforms several competitive baseline methods on two datasets, namely the Sanders Twitter corpus and a Facebook comments corpus. The CoRAE model achieves an efficiency of 83.28% on the Facebook dataset and 97.57% on the Sanders dataset.
Keywords: sentiment analysis, recursive auto-encoder, stacked auto-encoder, pointwise mutual information, deep embedding representation
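The abstract names pointwise mutual information (PMI) word vectors as the input to PMI-SA but does not spell out how they are built. The sketch below shows one common construction, using the standard definition PMI(w, c) = log(p(w, c) / (p(w) p(c))) over co-occurrence counts in a symmetric window. The function name `pmi_vectors`, the window size, the toy corpus, and the choice to assign 0 to unobserved pairs are all illustrative assumptions, not details taken from the paper.

```python
import math
from collections import Counter


def pmi_vectors(sentences, window=1):
    """Map each word to a vector of PMI scores against every vocabulary word.

    Hypothetical helper for illustration; not the paper's implementation.
    """
    # Count (word, context) pairs inside a symmetric window around each token.
    pair_counts = Counter()
    for tokens in sentences:
        for i, word in enumerate(tokens):
            start = max(0, i - window)
            stop = min(len(tokens), i + window + 1)
            for j in range(start, stop):
                if j != i:
                    pair_counts[(word, tokens[j])] += 1

    # Marginal counts for words and contexts, and the total number of pairs.
    total = sum(pair_counts.values())
    word_totals = Counter()
    context_totals = Counter()
    for (word, context), n in pair_counts.items():
        word_totals[word] += n
        context_totals[context] += n

    vocab = sorted({w for tokens in sentences for w in tokens})
    vectors = {}
    for word in vocab:
        row = []
        for context in vocab:
            n = pair_counts.get((word, context), 0)
            if n == 0:
                # Assumption: unobserved pairs get 0 instead of minus infinity.
                row.append(0.0)
            else:
                # log(p(w, c) / (p(w) * p(c))) expressed with raw counts.
                ratio = n * total / (word_totals[word] * context_totals[context])
                row.append(math.log(ratio))
        vectors[word] = row
    return vocab, vectors


# Toy corpus of tokenised sentences (made up for this example).
corpus = [["good", "movie"], ["bad", "movie"]]
vocab, vectors = pmi_vectors(corpus)
print(vocab)
print(vectors["good"])
```

Each vector has one dimension per vocabulary word, so its size grows with the vocabulary. This is the kind of high-dimensional input that the abstract says the auto-encoder is meant to reduce to a compact continuous representation.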