Combining Contextual and Structural Information for Supersense Tagging of Chinese Unknown Words

Qiu, Likun; Wu, Yunfang; Shao, Yanqiu

doi:10.1007/978-3-642-19400-9_2

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 6608)

Included in the following conference series:

International Conference on Intelligent Text Processing and Computational Linguistics

Abstract

Supersense tagging classifies unknown words into semantic categories defined by lexicographers and inserts them into a thesaurus. Previous studies on supersense tagging show that context-based methods perform well for English unknown words, while structure-based methods perform well for Chinese unknown words. The challenge is how to combine contextual and structural information for supersense tagging of Chinese unknown words. We propose a simple yet effective approach to address this challenge. In this approach, contextual information is used to measure contextual similarity between words, while structural information is used to filter candidate synonyms and adjust the contextual similarity scores. Experimental results show that the proposed approach outperforms the state-of-the-art context-based and structure-based methods.
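
The abstract gives only the outline of the approach, so the following Python sketch is an illustration under stated assumptions rather than the authors' method. It assumes cosine similarity over sparse context vectors, a structural filter that keeps only candidates sharing at least one character with the unknown word, a multiplicative boost when the final characters match, and a similarity-weighted vote over the top-k candidates. The function names, the `boost` and `k` parameters, the toy words, their context counts, and the category labels are all hypothetical.

```python
from collections import Counter
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two sparse context vectors (feature -> weight)."""
    dot = sum(w * v.get(f, 0.0) for f, w in u.items())
    norm_u = sqrt(sum(w * w for w in u.values()))
    norm_v = sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def shares_character(a, b):
    """Assumed structural filter: the two words have at least one character in common."""
    return bool(set(a) & set(b))


def tag_supersense(unknown, contexts, thesaurus, k=3, boost=1.5):
    """Return the category with the highest similarity-weighted vote, or None."""
    target = contexts.get(unknown, {})
    scored = []
    for word, category in thesaurus.items():
        # Structural filtering (assumption): drop candidates with no shared character.
        if not shares_character(unknown, word):
            continue
        score = cosine(target, contexts.get(word, {}))
        # Structural adjustment (assumption): boost candidates with the same final character.
        if word[-1] == unknown[-1]:
            score *= boost
        scored.append((score, word, category))
    scored.sort(reverse=True)
    votes = Counter()
    for score, _, category in scored[:k]:
        votes[category] += score
    return votes.most_common(1)[0][0] if votes else None


# Hypothetical toy data: known words with made-up category labels and context counts.
thesaurus = {"啤酒": "drink", "红酒": "drink", "酒店": "building", "白色": "color"}
contexts = {
    "米酒": {"喝": 2, "瓶": 1},
    "啤酒": {"喝": 3, "瓶": 2},
    "红酒": {"喝": 1, "杯": 1},
    "酒店": {"住": 2, "订": 1},
    "白色": {"穿": 1},
}

print(tag_supersense("米酒", contexts, thesaurus))  # prints "drink"
```

In this toy run, 白色 is removed by the character filter, 酒店 passes the filter but has no context overlap, and the two candidates ending in 酒 receive the boost, so the vote goes to the hypothetical "drink" category.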

Author information

Authors and Affiliations

Key Laboratory of Computational Linguistics, Ministry of Education, Peking University, 100871, Beijing, China
Likun Qiu & Yunfang Wu
Institute of Artificial Intelligence, Beijing City University, 100083, Beijing, China
Yanqiu Shao

Editor information

Editors and Affiliations

Center for Computing Research, National Polytechnic Institute, Mexico
Alexander F. Gelbukh

Cite this paper

Qiu, L., Wu, Y., Shao, Y. (2011). Combining Contextual and Structural Information for Supersense Tagging of Chinese Unknown Words. In: Gelbukh, A.F. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2011. Lecture Notes in Computer Science, vol 6608. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-19400-9_2

DOI: https://doi.org/10.1007/978-3-642-19400-9_2
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-19399-6
Online ISBN: 978-3-642-19400-9
eBook Packages: Computer Science, Computer Science (R0)
