Abstract
In Chinese, frequent words, which usually carry little significant information for information extraction, play an important role in the grammatical structure of sentences. However, the grammatical information of these words is usually ignored in Chinese segmentation. In this paper, we design an experiment for Chinese chunking that integrates the grammatical information of frequent words and investigates its effect on chunking. We use conditional random fields for chunking and rewrite the frequent words in the corpus so that they carry sentence structure information. The results show that the grammatical information of frequent words, even when only a very small number of such words is used, can significantly increase the accuracy of Chinese chunking.
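The abstract does not say how the frequent words are rewritten. As a minimal sketch of one possible reading, the Python below gives each of the k most frequent words its own word-specific tag, so that a sequence labeller such as a conditional random field could tell these words apart from others sharing the same part of speech. The function names, the `POS_word` tag format, the toy corpus, and the tag and chunk labels are all assumptions made for illustration and are not taken from the paper.

```python
from collections import Counter


def top_frequent_words(sentences, k):
    """Return the set of the k most frequent words in the corpus."""
    counts = Counter(word for sent in sentences for word, _pos, _chunk in sent)
    return {word for word, _count in counts.most_common(k)}


def rewrite_frequent_words(sentences, frequent):
    """Replace the POS tag of each frequent word with a word-specific tag.

    Hypothetical scheme: "POS_word" for frequent words, plain POS otherwise.
    """
    return [
        [
            (word, f"{pos}_{word}" if word in frequent else pos, chunk)
            for word, pos, chunk in sent
        ]
        for sent in sentences
    ]


# Toy corpus of (word, POS tag, chunk label) triples; all values are illustrative.
corpus = [
    [("我", "r", "B-NP"), ("的", "u", "O"), ("书", "n", "B-NP")],
    [("他", "r", "B-NP"), ("的", "u", "O"), ("猫", "n", "B-NP")],
]

frequent = top_frequent_words(corpus, k=1)
rewritten = rewrite_frequent_words(corpus, frequent)
```

The rewritten triples could then be turned into features for any CRF toolkit. Keeping k small reflects the abstract's observation that only a few frequent words are needed.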
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Qi, Q., Liu, L., Chen, Y. (2010). Frequent Words’ Grammar Information in Chinese Chunking. In: Cai, Z., Hu, C., Kang, Z., Liu, Y. (eds) Advances in Computation and Intelligence. ISICA 2010. Lecture Notes in Computer Science, vol 6382. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-16493-4_44
DOI: https://doi.org/10.1007/978-3-642-16493-4_44
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-16492-7
Online ISBN: 978-3-642-16493-4
eBook Packages: Computer Science, Computer Science (R0)