Active Learning for Cross-Lingual Sentiment Classification

  • Conference paper
Natural Language Processing and Chinese Computing (NLPCC 2013)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 400)

Abstract

Cross-lingual sentiment classification aims to predict the sentiment orientation of a text in one language (the target language) with the help of resources from another language (the source language). However, current cross-lingual performance is usually far from satisfactory because of the large differences in linguistic expression and social culture between languages. In this paper, we propose performing active learning for cross-lingual sentiment classification, in which only a small number of samples are actively selected and manually annotated, so that reasonable performance can be achieved in a short time for the target language. The challenge is that there are normally many more labeled samples in the source language than in the target language. As a result, the small number of labeled target-language samples is swamped by the abundance of labeled source-language samples, which greatly reduces their impact on cross-lingual sentiment classification. To address this issue, we propose a data quality control approach that selects high-quality samples from the source language. Specifically, we propose two kinds of data quality measurements, intra- and extra-quality measurements, from the certainty and similarity perspectives. Empirical studies verify the appropriateness of our active learning approach to cross-lingual sentiment classification.
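
The abstract does not give formulas for the two quality measurements, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes that certainty can be approximated by a classifier's highest posterior probability, that similarity can be approximated by cosine similarity to the centroid of the target-language samples, and that the two are combined by a simple product. All function names, the product combination, and the feature vectors are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def centroid(vectors):
    """Component-wise mean of a non-empty list of feature vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def select_source_samples(source_vecs, source_probs, target_vecs, k):
    """Return indices of the k source samples with the highest quality score.

    Assumed scoring (not taken from the paper):
      certainty  = highest class posterior of the source sample
      similarity = cosine similarity to the target-language centroid
      score      = certainty * similarity
    """
    target_centroid = centroid(target_vecs)
    scores = [
        max(probs) * cosine(vec, target_centroid)
        for vec, probs in zip(source_vecs, source_probs)
    ]
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


def most_uncertain(target_probs, n):
    """Uncertainty sampling: indices of the n unlabeled target samples whose
    highest class posterior is lowest, i.e. candidates for manual annotation."""
    ranked = sorted(range(len(target_probs)), key=lambda i: max(target_probs[i]))
    return ranked[:n]
```

In an active learning loop built on this sketch, `most_uncertain` would choose which target-language samples to send for annotation in each round, while `select_source_samples` would limit the source-language training data to samples that are both confidently classified and close to the target data, so that the few annotated target samples are not outweighed.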

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Li, S., Wang, R., Liu, H., Huang, CR. (2013). Active Learning for Cross-Lingual Sentiment Classification. In: Zhou, G., Li, J., Zhao, D., Feng, Y. (eds) Natural Language Processing and Chinese Computing. NLPCC 2013. Communications in Computer and Information Science, vol 400. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-41644-6_22

  • DOI: https://doi.org/10.1007/978-3-642-41644-6_22

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-41643-9

  • Online ISBN: 978-3-642-41644-6

  • eBook Packages: Computer Science, Computer Science (R0)