Abstract
The Chinese language has many unique characteristics, which make many conventional machine learning approaches to word POS tagging unsatisfactory and inefficient when applied to the Chinese POS tagging task. Accordingly, state-of-the-art Chinese word POS tagging does not perform as well as tagging for other languages such as English. In this paper, we present a novel Chinese word POS tagging method. We first assume that the character-based Chinese sentences have been fully segmented into words beforehand and that the phonetic notation (pinyin) of these words is also available. We then employ Markov Logic Networks (MLNs) to identify the Chinese word POS tags. MLNs can easily and flexibly represent the rich grammatical structure of Chinese. Furthermore, to support experiments and comparison, we build two benchmark datasets, dataset1 and dataset2, which correspond to two different sentence types: dataset1 consists of short sentences and dataset2 of long ones. The experimental results demonstrate that our approach significantly improves on the state-of-the-art performance of other POS tagging methods on these datasets with different sentence types.
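To make the idea of tagging with weighted logical formulas concrete, the following is a minimal sketch of how a Markov logic style model scores a tag assignment for an already segmented sentence. It is not the model from the paper: the tag set, the lexicon, the three formulas, and their weights are all hypothetical, and inference is done by brute-force enumeration, which is only feasible for toy inputs. Pinyin features are omitted. Each formula contributes its weight times the number of its true groundings, and the probability of an assignment is proportional to the exponential of that sum.

```python
import math
from itertools import product

# Toy tag set (an assumption): n = noun, v = verb, r = pronoun.
TAGS = ("n", "v", "r")

# Hypothetical lexicon; not taken from the paper.
LEXICON = {"我": "r", "爱": "v", "北京": "n"}


def n_lexicon(words, tags):
    """Count positions whose tag agrees with the lexicon entry for the word."""
    return sum(1 for w, t in zip(words, tags) if LEXICON.get(w) == t)


def n_pronoun_then_verb(words, tags):
    """Count true groundings of Tag(i, r) => Tag(i+1, v) over adjacent pairs."""
    return sum(1 for a, b in zip(tags, tags[1:]) if a != "r" or b == "v")


def n_no_adjacent_verbs(words, tags):
    """Count true groundings of !(Tag(i, v) ^ Tag(i+1, v)) over adjacent pairs."""
    return sum(1 for a, b in zip(tags, tags[1:]) if not (a == "v" and b == "v"))


# (weight, counting function) pairs; the weights are made up for illustration.
FORMULAS = (
    (2.0, n_lexicon),
    (1.0, n_pronoun_then_verb),
    (0.5, n_no_adjacent_verbs),
)


def score(words, tags):
    """Weighted sum of true-grounding counts (the unnormalized log-probability)."""
    return sum(weight * count(words, tags) for weight, count in FORMULAS)


def map_tags(words):
    """Most probable tag assignment, found by enumerating every assignment."""
    candidates = product(TAGS, repeat=len(words))
    return max(candidates, key=lambda tags: score(words, tags))


def probability(words, tags):
    """P(tags | words) = exp(score) / Z, with Z summed over all assignments."""
    z = sum(math.exp(score(words, c)) for c in product(TAGS, repeat=len(words)))
    return math.exp(score(words, tags)) / z


if __name__ == "__main__":
    sentence = ["我", "爱", "北京"]
    best = map_tags(sentence)
    print(best, round(probability(sentence, best), 3))
```

In this sketch a word missing from the lexicon can still receive a tag through the contextual formulas, which is one reason soft, weighted rules are attractive for this task. A practical system would learn the weights from annotated data and use approximate inference rather than enumeration.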
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Liao, Z., Zeng, Q., Wang, Q. (2015). Chinese Word POS Tagging with Markov Logic. In: Chau, M., Wang, G., Chen, H. (eds) Intelligence and Security Informatics. PAISI 2015. Lecture Notes in Computer Science, vol 9074. Springer, Cham. https://doi.org/10.1007/978-3-319-18455-5_7
DOI: https://doi.org/10.1007/978-3-319-18455-5_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-18454-8
Online ISBN: 978-3-319-18455-5
eBook Packages: Computer Science, Computer Science (R0)