Frequent Words’ Grammar Information in Chinese Chunking

  • Conference paper
Advances in Computation and Intelligence (ISICA 2010)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 6382)

Abstract

In Chinese, frequent words, which typically carry little information of value for information extraction, play an important role in the grammatical structure of sentences. However, the grammatical information of these words is usually ignored in Chinese segmentation. In this paper, we design an experiment for Chinese chunking that integrates the grammatical information of frequent words and investigates its effect on chunking. We use conditional random fields for chunking and rewrite the frequent words in the corpus so that they carry sentence-structure information. The results show that the grammatical information of frequent words, even when the number of such words is very small, can significantly increase the accuracy of Chinese chunking.
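The abstract does not say how the frequent words are rewritten, so the following is only a minimal sketch under stated assumptions, not the authors' procedure. It assumes a corpus of (word, POS) pairs, selects the most frequent word forms, and attaches each frequent word to its own tag so that a sequence labeler such as a conditional random field can distinguish it from other tokens with the same POS. The function names, the `top_k` parameter, the `POS-word` tag format, the feature names, and the toy corpus are all hypothetical.

```python
from collections import Counter

# Toy corpus of (word, POS) pairs; sentences and tags are illustrative only.
CORPUS = [
    [("我", "PN"), ("的", "DEG"), ("书", "NN")],
    [("他", "PN"), ("的", "DEG"), ("朋友", "NN"), ("的", "DEG"), ("车", "NN")],
]


def frequent_words(sentences, top_k):
    """Return the top_k most frequent word forms in the corpus (assumed criterion)."""
    counts = Counter(word for sent in sentences for word, _ in sent)
    return {word for word, _ in counts.most_common(top_k)}


def rewrite_sentence(sentence, frequent):
    """Lexicalize the tag of each frequent word; leave all other tokens unchanged."""
    return [
        (word, f"{pos}-{word}" if word in frequent else pos)
        for word, pos in sentence
    ]


def token_features(sentence, i):
    """Build a tag-window feature dict for token i, in the style CRF toolkits accept."""
    feats = {"tag": sentence[i][1]}
    if i > 0:
        feats["tag[-1]"] = sentence[i - 1][1]
    if i < len(sentence) - 1:
        feats["tag[+1]"] = sentence[i + 1][1]
    return feats


frequent = frequent_words(CORPUS, top_k=1)
rewritten = [rewrite_sentence(sent, frequent) for sent in CORPUS]
features = [token_features(rewritten[0], i) for i in range(len(rewritten[0]))]
```

In this sketch, only the tokens in `frequent` change, so the size of the frequent-word list is the single knob to vary when measuring how much this information affects chunking accuracy.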

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Qi, Q., Liu, L., Chen, Y. (2010). Frequent Words’ Grammar Information in Chinese Chunking. In: Cai, Z., Hu, C., Kang, Z., Liu, Y. (eds) Advances in Computation and Intelligence. ISICA 2010. Lecture Notes in Computer Science, vol 6382. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-16493-4_44

  • DOI: https://doi.org/10.1007/978-3-642-16493-4_44

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-16492-7

  • Online ISBN: 978-3-642-16493-4

  • eBook Packages: Computer Science, Computer Science (R0)
