Abstract
Supersense tagging classifies unknown words into semantic categories defined by lexicographers and inserts them into a thesaurus. Previous studies on supersense tagging show that context-based methods perform well for English unknown words, while structure-based methods perform well for Chinese unknown words. The challenge is how to combine contextual and structural information for supersense tagging of Chinese unknown words. We propose a simple yet effective approach to address this challenge. In this approach, contextual information is used to measure the contextual similarity between words, while structural information is used to filter candidate synonyms and adjust the contextual similarity scores. Experimental results show that the proposed approach outperforms both the state-of-the-art context-based method and the state-of-the-art structure-based method.
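The general idea of the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not specify the similarity measure, the structural filter, or the score adjustment, so everything below is an assumption made for illustration. The sketch assumes cosine similarity over sparse context vectors, a filter that keeps only known words sharing at least one character with the unknown word, a fixed multiplicative boost (`boost`) when the final characters match, and a similarity-weighted vote among the top `k` candidates. The function names, the toy lexicon, the context features, and the category labels are all hypothetical.

```python
from collections import Counter
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two sparse context vectors (dicts)."""
    dot = sum(w * v.get(f, 0.0) for f, w in u.items())
    norm_u = sqrt(sum(w * w for w in u.values()))
    norm_v = sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def tag_supersense(unknown, unknown_ctx, lexicon, k=3, boost=1.5):
    """Guess a category for `unknown` from its most similar known words.

    lexicon maps each known word to (context vector, category).
    All structural rules here are illustrative assumptions.
    """
    scored = []
    for word, (ctx, sense) in lexicon.items():
        # Structural filter (assumed): the candidate must share a character.
        if not set(word) & set(unknown):
            continue
        # Contextual similarity between the unknown word and the candidate.
        score = cosine(unknown_ctx, ctx)
        # Structural adjustment (assumed): boost when final characters match.
        if word[-1] == unknown[-1]:
            score *= boost
        scored.append((score, sense))
    # Similarity-weighted vote among the k best candidates.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    votes = Counter()
    for score, sense in scored[:k]:
        votes[sense] += score
    return votes.most_common(1)[0][0] if votes else None


# Hypothetical toy lexicon: word -> (context feature counts, category).
LEXICON = {
    "篮球场": ({"在": 2.0, "去": 1.0}, "location"),
    "网球": ({"打": 2.0, "在": 1.0}, "activity"),
    "苹果": ({"吃": 2.0}, "food"),
}

print(tag_supersense("网球场", {"在": 2.0, "打": 1.0}, LEXICON))
```

In this toy example, both 篮球场 and 网球 are equally similar to 网球场 in context, and the assumed structural boost for the shared final character 场 is what tips the vote toward the category of 篮球场.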
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Qiu, L., Wu, Y., Shao, Y. (2011). Combining Contextual and Structural Information for Supersense Tagging of Chinese Unknown Words. In: Gelbukh, A.F. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2011. Lecture Notes in Computer Science, vol 6608. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-19400-9_2
DOI: https://doi.org/10.1007/978-3-642-19400-9_2
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-19399-6
Online ISBN: 978-3-642-19400-9
eBook Packages: Computer Science (R0)