Cross-Domain Opinion Word Identification with Query-By-Committee Active Learning

  • Conference paper
Technologies and Applications of Artificial Intelligence (TAAI 2014)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8916)

Abstract

Opinion word identification (OWI) is an important task for opinion mining. In OWI, it is necessary to find the exact positions of opinion word mentions. Supervised learning approaches can locate such mentions with high accuracy. To construct an OWI system for a new domain, it is necessary to annotate enough data to represent the new domain’s characteristics. However, since annotating every new domain extensively is costly, making the best use of existing annotated data is an important challenge for mention-based OWI systems. In this work, we propose a cross-domain OWI system. The query by committee (QBC) active learning scheme is used to select controlled amounts of data in the new domain for manual annotation. This newly annotated data is used to complement the existing annotated data of the original domain. We compile three annotated datasets, one for each of three different domains, and conduct domain adaptation experiments on all six domain pairs. Our experiments show that by adding only 1,000 newly annotated sentences from the new domain to the existing annotated data, our system can achieve nearly the same level of accuracy as a system trained on 10,000 annotated new-domain sentences. Our system with the QBC active learning scheme also outperforms the same system with a random selection scheme.
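The abstract does not say how the committee is built or how disagreement is measured, so the following is only a minimal sketch of the general QBC selection step, not the authors' implementation. It assumes vote entropy averaged over tokens as the disagreement score, and it uses hypothetical lexicon-based taggers (`lexicon_tagger`) as stand-ins for trained committee members. The names `vote_entropy`, `sentence_disagreement`, and `select_for_annotation` are illustrative.

```python
import math
from collections import Counter


def vote_entropy(votes):
    """Entropy of the committee's label votes for a single token."""
    total = len(votes)
    counts = Counter(votes)
    return -sum((c / total) * math.log(c / total) for c in counts.values())


def sentence_disagreement(committee_labels):
    """Mean token-level vote entropy for one sentence.

    committee_labels holds one label sequence per committee member,
    all of the same length as the sentence.
    """
    n_tokens = len(committee_labels[0])
    if n_tokens == 0:
        return 0.0
    per_token = [
        vote_entropy([labels[i] for labels in committee_labels])
        for i in range(n_tokens)
    ]
    return sum(per_token) / n_tokens


def select_for_annotation(pool, committee, budget):
    """Return indices of the `budget` sentences the committee disagrees on most."""
    scored = []
    for idx, sentence in enumerate(pool):
        predictions = [member(sentence) for member in committee]
        scored.append((sentence_disagreement(predictions), idx))
    # Highest disagreement first; ties broken by original position.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [idx for _, idx in scored[:budget]]


def lexicon_tagger(lexicon):
    """Hypothetical committee member: tags a token 'OP' if it is in its lexicon."""
    def tag(sentence):
        return ["OP" if tok.lower() in lexicon else "O" for tok in sentence]
    return tag


# Toy committee and unlabeled new-domain pool (illustrative data only).
committee = [
    lexicon_tagger({"great", "poor"}),
    lexicon_tagger({"great", "sharp"}),
    lexicon_tagger({"great", "poor", "sharp", "noisy"}),
]
pool = [
    ["The", "screen", "is", "great"],        # all members agree
    ["The", "fan", "is", "noisy"],           # one token is disputed
    ["Sharp", "images", "poor", "battery"],  # two tokens are disputed
]

selected = select_for_annotation(pool, committee, budget=2)
print(selected)
```

In a cross-domain setting like the one described, the selected sentences would be sent for manual annotation and added to the original-domain training data before retraining. The random-selection baseline mentioned in the abstract corresponds to replacing the disagreement ranking with a random sample of the same size.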




© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Tsai, YL., Tsai, R.TH., Chueh, CH., Chang, SC. (2014). Cross-Domain Opinion Word Identification with Query-By-Committee Active Learning. In: Cheng, SM., Day, MY. (eds) Technologies and Applications of Artificial Intelligence. TAAI 2014. Lecture Notes in Computer Science, vol 8916. Springer, Cham. https://doi.org/10.1007/978-3-319-13987-6_31

  • DOI: https://doi.org/10.1007/978-3-319-13987-6_31

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-13986-9

  • Online ISBN: 978-3-319-13987-6

  • eBook Packages: Computer Science, Computer Science (R0)
