Sentiment Analysis on Twitter through Topic-Based Lexicon Expansion

  • Zhixin Zhou
  • Xiuzhen Zhang
  • Mark Sanderson
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8506)


Supervised learning approaches are domain-dependent, and labeled training data is costly to obtain for each new domain. Lexicon-based approaches perform stably across domains but often cannot capture domain-dependent features. Lexicon-based classifiers also struggle to identify the polarity of abbreviations and misspellings, which are common in short informal social text but usually absent from general sentiment lexicons. We propose to overcome this limitation by expanding a general lexicon with domain-dependent opinion words as well as abbreviations and informal opinion expressions. The expanded terms are selected automatically based on their mutual information with emoticons. Because emoticon-bearing tweets are abundant on Twitter, our approach enables domain-dependent sentiment analysis without the cost of data annotation. We show that our technique leads to statistically significant improvements in classification accuracy across 56 topics with a state-of-the-art lexicon-based classifier. We also present the expanded terms and show the most representative opinion expressions obtained from co-occurrence with emoticons.
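
To make the idea of selecting terms by their association with emoticons concrete, the following is a minimal sketch, not the authors' implementation. It assumes a pointwise mutual information (PMI) difference as the scoring function, whitespace tokenisation, add-one smoothing, and small hypothetical emoticon sets; the abstract does not specify any of these details, and the names `emoticon_label` and `polarity_scores` are illustrative only.

```python
import math
from collections import Counter

# Hypothetical emoticon sets; the lists used in the paper are not given here.
POSITIVE_EMOTICONS = {":)", ":-)", ":D"}
NEGATIVE_EMOTICONS = {":(", ":-("}
ALL_EMOTICONS = POSITIVE_EMOTICONS | NEGATIVE_EMOTICONS


def emoticon_label(tokens):
    """Return 'pos', 'neg', or None (no emoticon, or mixed emoticons)."""
    has_pos = any(t in POSITIVE_EMOTICONS for t in tokens)
    has_neg = any(t in NEGATIVE_EMOTICONS for t in tokens)
    if has_pos == has_neg:
        return None
    return "pos" if has_pos else "neg"


def polarity_scores(tweets, min_count=2, smoothing=1.0):
    """Score each term by PMI(term, pos) - PMI(term, neg).

    The difference simplifies to log2(P(term | pos) / P(term | neg)),
    estimated here from smoothed per-class tweet frequencies.
    """
    doc_counts = {"pos": 0, "neg": 0}
    term_counts = {"pos": Counter(), "neg": Counter()}

    for tweet in tweets:
        tokens = tweet.split()  # assumed tokenisation: whitespace only
        label = emoticon_label(tokens)
        if label is None:
            continue  # tweets without a clear emoticon signal are skipped
        doc_counts[label] += 1
        # Count each term once per tweet; emoticons themselves are excluded.
        terms = {t.lower() for t in tokens if t not in ALL_EMOTICONS}
        term_counts[label].update(terms)

    scores = {}
    vocab = set(term_counts["pos"]) | set(term_counts["neg"])
    for term in vocab:
        n_pos = term_counts["pos"][term]
        n_neg = term_counts["neg"][term]
        if n_pos + n_neg < min_count:
            continue  # too rare to score reliably
        p_pos = (n_pos + smoothing) / (doc_counts["pos"] + 2 * smoothing)
        p_neg = (n_neg + smoothing) / (doc_counts["neg"] + 2 * smoothing)
        scores[term] = math.log2(p_pos / p_neg)
    return scores
```

Under these assumptions, terms with strongly positive or strongly negative scores would be candidates for adding to a general lexicon, and running the scorer separately on the tweets of each topic would yield topic-specific candidates.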


Keywords: Sentiment Analysis; Opinion Word; Sentiment Lexicon; Label Training Data; Semantic Orientation





Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Zhixin Zhou (1)
  • Xiuzhen Zhang (1)
  • Mark Sanderson (1)
  1. Department of Computer Science and IT, RMIT University, Melbourne, Australia
