Automatic Extraction of Chinese V-N Collocations

  • Conference paper
Chinese Lexical Semantics (CLSW 2012)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7717)

Abstract

Chinese V-N collocations can stand in one of two structural relations: verb-object or attributive-head. Both are widely used in Chinese language processing tasks, but long-distance and low-frequency collocations are often difficult to extract. A weighted mutual information (WMI) model and a rule-based method were designed to acquire V-N collocations by taking more syntactic structure features into consideration. The WMI model extracted verb-object collocations within clauses. By giving different weights to collocates in different positions, it reduced the interference of illegal collocates and raised the weight of long-distance collocates. The rule-based method used part-of-speech patterns to extract verb-object and attributive-head collocations and to infer implicit collocations. The experiments show that the WMI model improves the evaluation scores of long-distance collocations, while the rule-based method is more accurate in extracting and distinguishing the two kinds of collocations, including low-frequency ones.
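The abstract does not give the WMI formula, so the following is only a minimal sketch of the general idea: pointwise mutual information computed over position-weighted co-occurrence counts within a clause. The function name `weighted_mi`, the `weights` table, the POS tags `v` and `n`, and the toy clauses are all assumptions made for illustration and are not taken from the paper.

```python
import math
from collections import defaultdict

def weighted_mi(clauses, weights, verb_tag="v", noun_tag="n"):
    """Score (verb, noun) pairs inside each clause with position-weighted PMI.

    clauses: list of clauses, each a list of (word, pos) tuples.
    weights: dict mapping offset (noun index - verb index, > 0) to a weight.
             Offsets missing from the dict get weight 0 and are ignored.
             This weighting table is hypothetical, not the paper's scheme.
    """
    pair_w = defaultdict(float)   # weighted co-occurrence mass per (verb, noun)
    verb_w = defaultdict(float)   # weighted marginal mass per verb
    noun_w = defaultdict(float)   # weighted marginal mass per noun
    total = 0.0
    for clause in clauses:
        for i, (verb, v_pos) in enumerate(clause):
            if v_pos != verb_tag:
                continue
            # Only nouns to the right of the verb, inside the same clause.
            for j in range(i + 1, len(clause)):
                noun, n_pos = clause[j]
                w = weights.get(j - i, 0.0)
                if n_pos != noun_tag or w == 0.0:
                    continue
                pair_w[(verb, noun)] += w
                verb_w[verb] += w
                noun_w[noun] += w
                total += w
    # PMI = log2( p(v, n) / (p(v) * p(n)) ), using weighted masses as counts.
    return {
        (v, n): math.log2(w * total / (verb_w[v] * noun_w[n]))
        for (v, n), w in pair_w.items()
    }

# Toy POS-tagged clauses (assumed data, not from the paper's corpus).
clauses = [
    [("解决", "v"), ("问题", "n")],
    [("解决", "v"), ("了", "u"), ("这个", "r"), ("问题", "n")],
    [("提高", "v"), ("水平", "n")],
]
# Assumed weights: the longer offset 3 is boosted relative to adjacency.
weights = {1: 1.0, 2: 1.0, 3: 1.5}
scores = weighted_mi(clauses, weights)
```

Restricting the search to one clause at a time mirrors the abstract's statement that collocations are extracted within clauses, and keeping the weights in a plain lookup table makes it easy to try different treatments of near and distant positions.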

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Qian, X. (2013). Automatic Extraction of Chinese V-N Collocations. In: Ji, D., Xiao, G. (eds) Chinese Lexical Semantics. CLSW 2012. Lecture Notes in Computer Science, vol 7717. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-36337-5_24

  • DOI: https://doi.org/10.1007/978-3-642-36337-5_24

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-36336-8

  • Online ISBN: 978-3-642-36337-5

  • eBook Packages: Computer Science, Computer Science (R0)
