Improving Word Sense Disambiguation by Pseudo-samples

Wang, Xiaojie; Matsumoto, Yuji

doi:10.1007/978-3-540-30211-7_41

Xiaojie Wang & Yuji Matsumoto

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3248)

Abstract

Data sparseness is a major problem in word sense disambiguation. Automatic sample acquisition and smoothing are two approaches that have been explored to alleviate it. In this paper, we consider a combination of the two. First, we propose a pattern-based way to acquire pseudo samples; we then estimate conditional probabilities for variables by combining the pseudo data set with the sense-tagged data set. The combined estimation lets us strike an appropriate balance between the two data sets, which is vital to achieving the best performance. Experiments show that our approach brings significant improvement for Chinese word sense disambiguation.
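
The abstract does not give the form of the combined estimate, so the following is only a minimal sketch of one common way such a combination could be set up, not the authors' method. It assumes a naive-Bayes-style classifier with a uniform sense prior, add-alpha smoothing, and a linear interpolation weight `lam` between the two data sets. The function names, the weight `lam`, and the toy English data are all hypothetical and chosen purely for illustration.

```python
import math
from collections import Counter, defaultdict


def conditional_probs(samples, vocab, alpha=1.0):
    """Estimate P(feature | sense) with add-alpha smoothing.

    samples: list of (sense, [context features]) pairs.
    """
    counts = defaultdict(Counter)
    for sense, features in samples:
        counts[sense].update(features)
    probs = {}
    for sense, feature_counts in counts.items():
        total = sum(feature_counts.values()) + alpha * len(vocab)
        probs[sense] = {f: (feature_counts[f] + alpha) / total for f in vocab}
    return probs


def combine(tagged_probs, pseudo_probs, lam):
    """Assumed combination: lam * P_tagged + (1 - lam) * P_pseudo."""
    combined = {}
    for sense, tagged in tagged_probs.items():
        # Fall back to the tagged estimate if a sense has no pseudo samples.
        pseudo = pseudo_probs.get(sense, tagged)
        combined[sense] = {f: lam * p + (1.0 - lam) * pseudo[f]
                           for f, p in tagged.items()}
    return combined


def disambiguate(context, probs):
    """Return the sense with the highest summed log P(feature | sense)."""
    def score(sense):
        return sum(math.log(probs[sense][f])
                   for f in context if f in probs[sense])
    return max(probs, key=score)


# Toy data, invented for illustration only.
vocab = {"money", "loan", "water", "deposit", "shore"}
tagged = [("finance", ["money", "loan"]), ("river", ["water"])]
pseudo = [("finance", ["money", "deposit", "loan"]),
          ("river", ["water", "shore", "water"])]

tagged_p = conditional_probs(tagged, vocab)
pseudo_p = conditional_probs(pseudo, vocab)

print(disambiguate(["deposit"], combine(tagged_p, pseudo_p, 1.0)))  # river
print(disambiguate(["deposit"], combine(tagged_p, pseudo_p, 0.5)))  # finance
```

In this toy setting the feature "deposit" never occurs in the sense-tagged data, so the tagged-only estimate (`lam = 1.0`) picks the wrong sense, while mixing in the pseudo samples (`lam = 0.5`) recovers the intended one. In practice the weight would presumably be tuned on held-out data, since the abstract stresses that the balance between the two data sets matters.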

Author information

Authors and Affiliations

Graduate School of Information Science, Nara Institute of Science and Technology, 8916-5 Takayama, Ikoma, Nara, 630-0192, Japan
Xiaojie Wang & Yuji Matsumoto
School of Information Engineering, Beijing University of Posts and Technology, Beijing, 100876, China
Xiaojie Wang

Editor information

Editors and Affiliations

Behavior Design Corporation, IV Science-Based Industrial Park Hsinchu, 2F, No.5, Industry E. Rd, Taiwan
Keh-Yih Su
University of Tokyo, Hongo 7-3-1, Bunkyo-ku, Tokyo 113-0033, JST CREST, Honcho 4-1-8, Kawaguchi-shi, 332-0012, Saitama
Jun’ichi Tsujii
Pohang University of Science and Technology (POSTECH), AITrc, Republic of Korea
Jong-Hyeok Lee
Language Information Sciences Research Centre, City University of Hong Kong, Tat Chee Avenue, Kowloon, Hong Kong
Oi Yee Kwong

About this paper

Cite this paper

Wang, X., Matsumoto, Y. (2005). Improving Word Sense Disambiguation by Pseudo-samples. In: Su, KY., Tsujii, J., Lee, JH., Kwong, O.Y. (eds) Natural Language Processing – IJCNLP 2004. IJCNLP 2004. Lecture Notes in Computer Science, vol 3248. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30211-7_41

DOI: https://doi.org/10.1007/978-3-540-30211-7_41
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-24475-2
Online ISBN: 978-3-540-30211-7
eBook Packages: Computer Science, Computer Science (R0)