Learning and Knowledge-Based Sentiment Analysis in Movie Review Key Excerpts

  • Björn Schuller
  • Tobias Knaup
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6456)


We propose a data-driven approach based on back-off N-grams and Support Vector Machines, which have recently become popular in sentiment and emotion recognition. In addition, we introduce a novel valence classifier based on linguistic analysis and the on-line knowledge sources ConceptNet, General Inquirer, and WordNet. As a special benefit, this approach does not require labeled training data. Moreover, we show how such knowledge sources can be leveraged to reduce out-of-vocabulary events in learning-based processing. To profit from both of these generally different concepts and their independent knowledge sources, we employ information fusion techniques to combine their strengths, which ultimately leads to better overall performance. Finally, we extend the data-driven classifier to solve a regression problem in order to obtain a more fine-grained resolution of valence.
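
As a rough illustration of the knowledge-based scoring and the fusion step described above, the following Python sketch scores valence from a word list without any labeled training data and then combines that score with a second, learned score. The toy lexicon, the simple negation handling, the function names, and the fusion weight are all assumptions made for illustration; they are not taken from the paper, whose system draws on General Inquirer, WordNet, and ConceptNet.

```python
# Hypothetical toy lexicon standing in for a real knowledge source.
# The values are illustrative assumptions, not entries from General Inquirer.
TOY_LEXICON = {"brilliant": 1.0, "good": 0.5, "dull": -0.5, "awful": -1.0}
NEGATORS = {"not", "never", "no"}


def lexicon_valence(text: str) -> float:
    """Average valence of known words in [-1, 1].

    A negator flips the sign of the next word found in the lexicon.
    Returns 0.0 when no word of the text is in the lexicon.
    """
    total, hits, flip = 0.0, 0, False
    for token in text.lower().split():
        word = token.strip(".,!?;:")
        if word in NEGATORS:
            flip = True
            continue
        if word in TOY_LEXICON:
            score = TOY_LEXICON[word]
            total += -score if flip else score
            hits += 1
            flip = False
    return total / hits if hits else 0.0


def fuse(knowledge_score: float, learned_score: float, weight: float = 0.5) -> float:
    """Weighted late fusion of two valence scores on the same [-1, 1] scale.

    `weight` is the share given to the knowledge-based score (an assumed
    parameter; it would normally be tuned on held-out data).
    """
    return weight * knowledge_score + (1.0 - weight) * learned_score
```

Here `learned_score` stands for the output of any data-driven classifier or regressor mapped to the same scale, so a call such as `fuse(lexicon_valence(excerpt), learned_score)` yields one combined valence estimate per review excerpt.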


Sentiment Analysis, Emotion Recognition





Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Björn Schuller (1)
  • Tobias Knaup (2)
  1. Institute for Human-Machine Communication, Technische Universität München, Germany
  2. Pingsta Inc., Redwood City, USA
