Active Learning to Remove Source Instances for Domain Adaptation for Word Sense Disambiguation

  • Conference paper

Part of the book series: Communications in Computer and Information Science (CCIS, volume 593)

Abstract

This paper presents an active learning method for domain adaptation in word sense disambiguation. In general, active learning selects the data with the highest learning effect from an unlabeled data set, has it labeled manually, and adds it to the training data. In domain adaptation, however, some data in the source domain can degrade classification precision (misleading data), and the resulting errors carry over into the adaptation. Each time data labeled through active learning is added to the training data, the method therefore attempts to detect misleading data in the source domain and delete it from the training data. As a result, classification precision improves over standard learning.
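
The loop described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it swaps in a nearest-centroid classifier for brevity, uses margin-based uncertainty sampling as the selection criterion, and flags a source instance as misleading when it lies within a hypothetical `radius` of a newly labeled target instance but carries a different label. The function names, the `oracle` callback, and the `radius` parameter are all invented for this example.

```python
import math


def centroids(train):
    """Mean feature vector per label (a nearest-centroid classifier)."""
    sums, counts = {}, {}
    for x, y, _ in train:
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}


def margin(cents, x):
    """Gap between the two closest centroids; a small gap means an uncertain point."""
    d = sorted(math.dist(c, x) for c in cents.values())
    return d[1] - d[0] if len(d) > 1 else float("inf")


def active_learning_with_removal(source, pool, oracle, rounds, radius):
    """Add manually labeled target data and drop conflicting source data.

    source: list of (features, label) pairs from the source domain
    pool:   list of unlabeled target-domain feature tuples
    oracle: callable returning the manual label for a feature tuple
    radius: assumed threshold for the removal heuristic (hypothetical)
    """
    train = [(x, y, "source") for x, y in source]
    pool = list(pool)
    for _ in range(min(rounds, len(pool))):
        cents = centroids(train)
        # Select the pool instance the current model is least sure about.
        query = min(pool, key=lambda x: margin(cents, x))
        pool.remove(query)
        label = oracle(query)
        train.append((query, label, "target"))
        # Assumed heuristic: the nearest source instance is treated as
        # misleading if it is close to the query but labeled differently.
        src = [t for t in train if t[2] == "source"]
        if src:
            nearest = min(src, key=lambda t: math.dist(t[0], query))
            if nearest[1] != label and math.dist(nearest[0], query) <= radius:
                train.remove(nearest)
    return train
```

For example, with one-dimensional source data whose class boundary sits lower than in the target domain, the conflicting source instance is dropped once a nearby target instance has been labeled:

```python
source = [((0.0,), "a"), ((1.0,), "a"), ((2.0,), "b"), ((3.0,), "b")]
pool = [(1.9,), (2.1,), (4.0,), (0.5,)]
oracle = lambda x: "a" if x[0] < 2.5 else "b"  # stand-in for manual labeling
train = active_learning_with_removal(source, pool, oracle, rounds=2, radius=0.5)
```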

Notes

  1. The word “(Hairu)” has three senses in the dictionary. However, it has four senses in the OC and PB domains. The fourth sense is new. In the Japanese WSD task of SemEval-2, tagging this new sense was attempted.

  2. http://www.csie.ntu.edu.tw/~cjlin/libsvm/

Corresponding author

Correspondence to Hiroyuki Shinnou.

Copyright information

© 2016 Springer Science+Business Media Singapore

About this paper

Cite this paper

Shinnou, H., Onodera, Y., Sasaki, M., Komiya, K. (2016). Active Learning to Remove Source Instances for Domain Adaptation for Word Sense Disambiguation. In: Hasida, K., Purwarianti, A. (eds) Computational Linguistics. PACLING 2015. Communications in Computer and Information Science, vol 593. Springer, Singapore. https://doi.org/10.1007/978-981-10-0515-2_7

  • DOI: https://doi.org/10.1007/978-981-10-0515-2_7

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-10-0514-5

  • Online ISBN: 978-981-10-0515-2

  • eBook Packages: Computer Science, Computer Science (R0)
