Cross-Lingual Sentiment Classification via Bi-view Non-negative Matrix Tri-Factorization

  • Conference paper
Advances in Knowledge Discovery and Data Mining (PAKDD 2011)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 6634)

Abstract

Sentiment classification has recently attracted researchers worldwide, but most sentiment corpora are in English, which limits research progress on sentiment classification in other languages. Cross-lingual sentiment classification aims to use annotated sentiment corpora in one language (e.g., English) as training data to predict the sentiment polarity of data in another language (e.g., Chinese). In this paper, we design a bi-view non-negative matrix tri-factorization (BNMTF) model for the cross-lingual sentiment classification problem. We employ a machine translation service so that both training and test data have two representations, one in the source language and the other in the target language. Our BNMTF model is derived from the non-negative matrix tri-factorization models in both languages in order to make more accurate predictions. It has three main advantages: (1) it combines information from the two views; (2) it incorporates lexical knowledge and training-document label knowledge; and (3) it adds information from the test documents. Experimental results show the effectiveness of our BNMTF model, which can outperform baseline approaches to cross-lingual sentiment classification.
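
For readers unfamiliar with the building block named in the abstract, the following is a minimal sketch of plain (single-view) non-negative matrix tri-factorization, assuming a small non-negative word-by-document count matrix X that is approximated as F S Gᵀ with standard multiplicative updates. It is not the BNMTF model itself: the second language view, the lexical prior, and the training-label constraints described in the abstract are all omitted. The function name `nmtf`, its parameters, and the toy matrix are hypothetical and chosen only for illustration.

```python
import numpy as np


def nmtf(X, k_rows, k_cols, n_iter=200, eps=1e-9, seed=0):
    """Approximate non-negative X (m x n) as F @ S @ G.T with F, S, G >= 0.

    F (m x k_rows) softly assigns rows to row clusters, G (n x k_cols) assigns
    columns to column clusters, and S (k_rows x k_cols) links the two.
    Uses unconstrained multiplicative updates for the squared Frobenius loss.
    """
    rng = np.random.default_rng(seed)
    m, n = X.shape
    F = rng.random((m, k_rows)) + eps
    S = rng.random((k_rows, k_cols)) + eps
    G = rng.random((n, k_cols)) + eps
    errors = []
    for _ in range(n_iter):
        # Each factor is scaled by (negative gradient part) / (positive part),
        # which keeps every entry non-negative.
        F *= (X @ G @ S.T) / (F @ S @ G.T @ G @ S.T + eps)
        G *= (X.T @ F @ S) / (G @ S.T @ F.T @ F @ S + eps)
        S *= (F.T @ X @ G) / (F.T @ F @ S @ G.T @ G + eps)
        errors.append(np.linalg.norm(X - F @ S @ G.T))
    return F, S, G, errors


# Toy data (made up): 6 words x 4 documents, with two loose blocks.
X = np.array([
    [3, 2, 0, 0],
    [2, 3, 0, 0],
    [4, 1, 0, 1],
    [0, 0, 3, 2],
    [0, 1, 2, 4],
    [0, 0, 1, 3],
], dtype=float)

F, S, G, errors = nmtf(X, k_rows=2, k_cols=2)

# Hard cluster index per document: the largest entry in each row of G.
doc_clusters = G.argmax(axis=1)
print("reconstruction error:", errors[0], "->", errors[-1])
print("document clusters:", doc_clusters)
```

In a sentiment setting the two column clusters would play the role of polarity classes, but which cluster index corresponds to which polarity is arbitrary in this unsupervised sketch; resolving that is one thing label and lexical knowledge would be needed for.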

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Pan, J., Xue, GR., Yu, Y., Wang, Y. (2011). Cross-Lingual Sentiment Classification via Bi-view Non-negative Matrix Tri-Factorization. In: Huang, J.Z., Cao, L., Srivastava, J. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2011. Lecture Notes in Computer Science, vol 6634. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-20841-6_24

  • DOI: https://doi.org/10.1007/978-3-642-20841-6_24

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-20840-9

  • Online ISBN: 978-3-642-20841-6

  • eBook Packages: Computer Science, Computer Science (R0)
