A Text Sentimental Analysis Method Based on Dimension Reduction of CHI Multi-gram Features Mixture

  • Fulian Yin
  • Yanyan Wang
  • Jianbo Liu
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 1075)


To address the growing computational cost caused by high-dimensional features, we propose a text sentiment analysis method based on dimension reduction of a Chi-square statistic (CHI) multi-gram feature mixture. The method not only improves feature extraction but also determines the feature dimension precisely, unlike traditional methods that set it from empirical values. Experimental results show that the proposed method outperforms existing methods, reaching a highest accuracy of 94.85%. Moreover, the experiments indicate that the method generalizes to subjective/objective classification and to review texts of different lengths.
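As a rough illustration of the CHI stage only, the sketch below scores a mixture of unigrams and bigrams with the standard chi-square statistic for a binary sentiment label and keeps the highest-scoring ones. The abstract does not give the authors' exact formulation, so the function names (`mixed_ngrams`, `chi_scores`, `top_k_features`), the whitespace tokenization, the `max_n` setting, and the toy documents are all assumptions made for this example. The subsequent dimension reduction step (principal component analysis, per the keywords) is not shown.

```python
from collections import Counter


def mixed_ngrams(tokens, max_n=2):
    """Return the set of all 1..max_n-grams found in a token list (hypothetical helper)."""
    grams = set()
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            grams.add(" ".join(tokens[i:i + n]))
    return grams


def chi_scores(docs, labels, max_n=2):
    """Chi-square score of every n-gram against a binary label (1 = positive, 0 = negative)."""
    total = len(docs)
    n_pos = sum(labels)
    n_neg = total - n_pos
    pos_df, neg_df = Counter(), Counter()  # document frequencies per class
    for text, label in zip(docs, labels):
        grams = mixed_ngrams(text.lower().split(), max_n)  # assumed tokenization
        (pos_df if label == 1 else neg_df).update(grams)

    scores = {}
    for gram in set(pos_df) | set(neg_df):
        a = pos_df[gram]   # positive documents containing the n-gram
        b = neg_df[gram]   # negative documents containing the n-gram
        c = n_pos - a      # positive documents without it
        d = n_neg - b      # negative documents without it
        denom = (a + c) * (b + d) * (a + b) * (c + d)
        # Standard CHI: N * (AD - BC)^2 / ((A+C)(B+D)(A+B)(C+D)); 0 if undefined.
        scores[gram] = total * (a * d - b * c) ** 2 / denom if denom else 0.0
    return scores


def top_k_features(scores, k):
    """Keep the k highest-scoring n-grams; ties are broken alphabetically."""
    return sorted(scores, key=lambda g: (-scores[g], g))[:k]


# Toy data, invented for this example only.
docs = ["good movie", "good plot", "bad movie", "bad plot"]
labels = [1, 1, 0, 0]
scores = chi_scores(docs, labels)
selected = top_k_features(scores, 2)
```

On this toy data the class-specific words "good" and "bad" receive the highest scores, while "movie", which appears equally in both classes, scores zero. Choosing `k` by hand is exactly the empirical setting the abstract contrasts with; the proposed method instead determines the final dimension through its reduction step.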


Keywords: Chi-square statistics; Multi-grams mixture; Principal component analysis; Text sentiment analysis



This work is supported by the National Natural Science Foundation of China (No. 61801440), the Fundamental Research Funds for the Central Universities and the Communication University of China’s state-of-the-art training research project (No. CUC18A015-1).



Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. College of Information and Communication Engineering, Communication University of China, Beijing, People’s Republic of China
