Improving Word Sense Disambiguation by Pseudo-samples

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3248)

Abstract

Data sparseness is a major problem in word sense disambiguation. Automatic sample acquisition and smoothing are two approaches that have been explored to alleviate its influence. In this paper, we consider a combination of the two. First, we propose a pattern-based method for acquiring pseudo-samples; we then estimate conditional probabilities for variables by combining the pseudo data set with the sense-tagged data set. The combined estimation lets us strike an appropriate balance between the two data sets, which is vital to achieving the best performance. Experiments show that our approach brings significant improvement for Chinese word sense disambiguation.
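
The abstract does not state the formula used to combine the two data sets, so the following is only a minimal sketch under an assumption: that the combined estimate is a linear interpolation of per-sense conditional probabilities, with a weight `lam` controlling the balance between sense-tagged and pseudo data. The function names, the add-alpha smoothing, the weight `lam`, and the toy English samples are all hypothetical illustrations and are not taken from the paper.

```python
from collections import Counter

def conditional_probs(samples, vocab, alpha=1.0):
    """Add-alpha estimate of P(feature | sense) from (sense, features) pairs.

    Assumption: a simple smoothed relative-frequency estimate; the paper's
    actual estimator is not given in the abstract.
    """
    counts = {}
    for sense, features in samples:
        # Count only features that belong to the shared vocabulary.
        counts.setdefault(sense, Counter()).update(
            f for f in features if f in vocab
        )
    probs = {}
    for sense, c in counts.items():
        total = sum(c.values()) + alpha * len(vocab)
        probs[sense] = {f: (c[f] + alpha) / total for f in vocab}
    return probs

def combine(p_tagged, p_pseudo, lam):
    """Hypothetical combination: lam * P_tagged + (1 - lam) * P_pseudo."""
    combined = {}
    for sense, dist in p_tagged.items():
        other = p_pseudo.get(sense)
        if other is None:
            # No pseudo-samples for this sense: fall back to tagged data only.
            combined[sense] = dict(dist)
        else:
            combined[sense] = {
                f: lam * p + (1.0 - lam) * other[f] for f, p in dist.items()
            }
    return combined

# Toy data for illustration only (not from the paper).
vocab = ["bank", "river", "money"]
tagged_samples = [("finance", ["money", "money"]), ("shore", ["river"])]
pseudo_samples = [("finance", ["money", "bank"]),
                  ("shore", ["river", "river", "bank"])]

p_tagged = conditional_probs(tagged_samples, vocab)
p_pseudo = conditional_probs(pseudo_samples, vocab)
p_combined = combine(p_tagged, p_pseudo, lam=0.5)
```

Under this assumption, `lam = 1.0` ignores the pseudo-samples entirely and `lam = 0.0` relies on them alone; intermediate values would be tuned on held-out data to find the balance the abstract describes as vital.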

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Wang, X., Matsumoto, Y. (2005). Improving Word Sense Disambiguation by Pseudo-samples. In: Su, KY., Tsujii, J., Lee, JH., Kwong, O.Y. (eds) Natural Language Processing – IJCNLP 2004. IJCNLP 2004. Lecture Notes in Computer Science, vol 3248. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30211-7_41

  • DOI: https://doi.org/10.1007/978-3-540-30211-7_41

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-24475-2

  • Online ISBN: 978-3-540-30211-7

  • eBook Packages: Computer Science, Computer Science (R0)
