Combining Contextual and Structural Information for Supersense Tagging of Chinese Unknown Words

  • Conference paper
Computational Linguistics and Intelligent Text Processing (CICLing 2011)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 6608)

Abstract

Supersense tagging classifies unknown words into semantic categories defined by lexicographers and inserts them into a thesaurus. Previous studies on supersense tagging show that context-based methods perform well for English unknown words, while structure-based methods perform well for Chinese unknown words. The challenge is how to combine contextual and structural information for supersense tagging of Chinese unknown words. We propose a simple yet effective approach to address this challenge. In this approach, contextual information is used to measure contextual similarity between words, while structural information is used to filter candidate synonyms and adjust the contextual similarity scores. Experimental results show that the proposed approach outperforms the state-of-the-art context-based and structure-based methods.
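
The abstract gives only the outline of the approach, so the following is a minimal sketch of one way such a combination could look, not the authors' implementation. Everything specific in it is an assumption made for illustration: cosine similarity as the contextual measure, "shares at least one character" as the structural filter, a multiplicative `boost` when the final character matches as the score adjustment, a top-`k` weighted vote, and the toy lexicon and category labels.

```python
from collections import defaultdict
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two sparse context vectors (feature -> weight)."""
    dot = sum(w * b.get(f, 0.0) for f, w in a.items())
    norm_a = sqrt(sum(w * w for w in a.values()))
    norm_b = sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def tag_supersense(unknown, unknown_ctx, lexicon, k=3, boost=1.5):
    """Pick a category for `unknown` from its most similar known words.

    lexicon maps each known word to (category, context vector).
    The filter, the boost, and the voting rule are all hypothetical.
    """
    scored = []
    for word, (category, ctx) in lexicon.items():
        # Structural filter (assumed): keep only candidates that share
        # at least one character with the unknown word.
        if not set(unknown) & set(word):
            continue
        # Contextual similarity between the unknown word and the candidate.
        score = cosine(unknown_ctx, ctx)
        # Structural adjustment (assumed): reward a matching final character.
        if unknown[-1] == word[-1]:
            score *= boost
        scored.append((score, category))

    # Weighted vote over the k highest-scoring candidates.
    scored.sort(reverse=True)
    votes = defaultdict(float)
    for score, category in scored[:k]:
        votes[category] += score
    return max(votes, key=votes.get) if votes else None


# Toy data, invented for this example only.
toy_lexicon = {
    "篮球场": ("location", {"在": 2.0, "比赛": 3.0}),
    "网民": ("person", {"上网": 3.0, "说": 1.0}),
    "苹果": ("food", {"吃": 3.0}),
}
result = tag_supersense("网球场", {"在": 1.0, "比赛": 2.0}, toy_lexicon)
```

In this toy run, "苹果" is dropped by the structural filter because it shares no character with "网球场", and "篮球场" receives both a high contextual score and the final-character boost, so `result` is `"location"`.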

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Qiu, L., Wu, Y., Shao, Y. (2011). Combining Contextual and Structural Information for Supersense Tagging of Chinese Unknown Words. In: Gelbukh, A.F. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2011. Lecture Notes in Computer Science, vol 6608. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-19400-9_2

  • DOI: https://doi.org/10.1007/978-3-642-19400-9_2

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-19399-6

  • Online ISBN: 978-3-642-19400-9

  • eBook Packages: Computer Science, Computer Science (R0)
