Soft Computing, Volume 23, Issue 2, pp 599–611

An improved algorithm for sentiment analysis based on maximum entropy

  • Xin Xie
  • Songlin Ge
  • Fengping Hu
  • Mingye Xie
  • Nan Jiang
Methodologies and Application

Abstract

Sentiment analysis is an important field of study in natural language processing. Achieving high-accuracy sentiment classification on massive and irregular data is a major challenge. To address this problem, a novel maximum entropy-PLSA model is proposed. In this model, we first use probabilistic latent semantic analysis (PLSA) to extract seed emotion words from Wikipedia and the training corpus. Features are then extracted from these seed emotion words and used as input to train the maximum entropy model. The test set is processed in the same way and fed into the trained maximum entropy model for emotion classification. The training and test sets are divided by the K-fold method. The maximum entropy classifier based on PLSA classifies words using important emotion classification features, such as the relevance of words and parts of speech in the context, the relevance to degree adverbs, and the similarity to benchmark emotion words. The experiments show that the proposed classification method outperforms the compared methods.
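
As a rough illustration of the last two stages described in the abstract (K-fold splitting and maximum entropy classification), the following minimal Python sketch may help. It is not the authors' implementation: the PLSA seed-word step is omitted, the two features (similarity to positive and to negative benchmark emotion words) are hypothetical stand-ins filled with synthetic values, and the classifier is plain binary logistic regression, which is the two-class case of a maximum entropy model. All function names are assumptions made for this example.

```python
import math
import random


def k_fold_indices(n_samples, k, seed=0):
    """Split range(n_samples) into k shuffled, near-equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def predict_proba(weights, bias, x):
    """Probability of the positive class under the log-linear model."""
    z = bias + sum(w * xj for w, xj in zip(weights, x))
    z = max(-30.0, min(30.0, z))  # clamp to keep exp() well behaved
    return 1.0 / (1.0 + math.exp(-z))


def train_maxent(features, labels, epochs=200, lr=0.5):
    """Fit a binary maximum entropy model by batch gradient ascent
    on the average log-likelihood."""
    n, n_features = len(features), len(features[0])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for x, y in zip(features, labels):
            err = y - predict_proba(weights, bias, x)
            for j, xj in enumerate(x):
                grad_w[j] += err * xj
            grad_b += err
        weights = [w + lr * g / n for w, g in zip(weights, grad_w)]
        bias += lr * grad_b / n
    return weights, bias


def cross_validate(features, labels, k=5):
    """Mean held-out accuracy over k folds."""
    accuracies = []
    for held_out in k_fold_indices(len(features), k):
        held = set(held_out)
        train_idx = [i for i in range(len(features)) if i not in held]
        weights, bias = train_maxent(
            [features[i] for i in train_idx],
            [labels[i] for i in train_idx],
        )
        correct = sum(
            (predict_proba(weights, bias, features[i]) >= 0.5) == bool(labels[i])
            for i in held_out
        )
        accuracies.append(correct / len(held_out))
    return sum(accuracies) / len(accuracies)


def make_toy_data(n_per_class=10, seed=1):
    """Synthetic stand-in data: [similarity to positive seed words,
    similarity to negative seed words]. Hypothetical, not from the paper."""
    rng = random.Random(seed)
    features, labels = [], []
    for _ in range(n_per_class):
        features.append([rng.uniform(0.7, 0.95), rng.uniform(0.05, 0.3)])
        labels.append(1)
        features.append([rng.uniform(0.05, 0.3), rng.uniform(0.7, 0.95)])
        labels.append(0)
    return features, labels


if __name__ == "__main__":
    toy_features, toy_labels = make_toy_data()
    print("mean 5-fold accuracy:", cross_validate(toy_features, toy_labels, k=5))
```

In a fuller pipeline, the feature vectors would presumably be derived from the PLSA-extracted seed emotion words, and a multi-class maximum entropy model would replace the binary one used here for brevity.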

Keywords

Semantic analysis, Maximum entropy, Probabilistic latent semantic analysis

Notes

Acknowledgements

This work is supported by the National Natural Science Foundation under Grant Nos. 61762037, 61640217, 41402290, and 61462028; the Science and Technology Support Program of Jiangxi Province under Grant No. 20151BBE50055; the Science and Technology Project of the Education Department of Jiangxi Province under Grant No. GJJ150541; and the Nanchang City Sensor Network and Compressed Sensing Knowledge Innovation Team under Grant No. 2016T75.

Compliance with ethical standards

Conflict of interest

The authors declare that there is no conflict of interest regarding the publication of this paper.

Copyright information

© Springer-Verlag GmbH Germany 2017

Authors and Affiliations

  1. School of Information Engineering, East China Jiaotong University, Nanchang, People’s Republic of China
  2. School of Civil Engineering, East China Jiaotong University, Nanchang, People’s Republic of China
  3. School of Information Science Technology, East China Normal University, Shanghai, People’s Republic of China
