Active Learning on Sentiment Classification by Selecting Both Words and Documents

  • Conference paper
Chinese Lexical Semantics (CLSW 2012)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7717)

Abstract

Sentiment analysis has become an active research topic in natural language processing (NLP) because it is valuable for many real-world applications. One basic task in sentiment analysis is sentiment classification, which aims to predict the sentiment orientation (positive or negative) of a document. Current approaches to this problem are mainly based on supervised machine learning. The main drawback of such approaches is their need for large amounts of labeled data, so reducing the annotation cost has become an important issue in sentiment classification. In this study, we propose a novel active learning approach that selects both "informative" word and document samples for annotation. Experimental results show that our approach clearly outperforms both random selection and uncertainty sampling on documents.
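For readers unfamiliar with the two document-level baselines named in the abstract, the following is a minimal sketch of random selection and uncertainty sampling over an unlabeled pool. It is not the approach proposed in the paper, whose word-selection step is not described in the abstract. The function names, the `pool_probs` values, and the use of distance from 0.5 as the uncertainty measure for a binary classifier are all assumptions made for illustration.

```python
import random


def uncertainty_sample(pos_probs, k):
    """Return indices of the k documents whose predicted P(positive)
    is closest to 0.5, i.e. where a binary classifier is least certain."""
    ranked = sorted(range(len(pos_probs)), key=lambda i: abs(pos_probs[i] - 0.5))
    return ranked[:k]


def random_sample(n_docs, k, seed=0):
    """Return k distinct document indices chosen uniformly at random."""
    rng = random.Random(seed)
    return rng.sample(range(n_docs), k)


# Hypothetical classifier outputs for five unlabeled documents.
pool_probs = [0.95, 0.52, 0.10, 0.45, 0.70]

# Documents 1 and 3 sit nearest the decision boundary, so they are
# the ones sent to the annotator under uncertainty sampling.
picked = uncertainty_sample(pool_probs, 2)
print(picked)  # [1, 3]
```

In an active learning loop, the selected documents would be labeled, added to the training set, and the classifier retrained before the next round of selection.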

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Ju, S., Li, S. (2013). Active Learning on Sentiment Classification by Selecting Both Words and Documents. In: Ji, D., Xiao, G. (eds) Chinese Lexical Semantics. CLSW 2012. Lecture Notes in Computer Science, vol 7717. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-36337-5_6

  • DOI: https://doi.org/10.1007/978-3-642-36337-5_6

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-36336-8

  • Online ISBN: 978-3-642-36337-5

  • eBook Packages: Computer Science (R0)
