Conditional Random Fields Based Label Sequence and Information Feedback
Part-of-speech (POS) tagging and shallow parsing are sequence modeling problems. Hidden Markov models (HMM) and other generative models are not the most appropriate for labeling sequential data. Compared with HMM, maximum entropy Markov models (MEMM) and other discriminative finite-state models can easily incorporate more features; however, they suffer from the label bias problem. This paper presents a method for Chinese POS tagging and shallow parsing based on conditional random fields (CRF), a newer class of discriminative sequential models that can incorporate many rich features while avoiding the label bias problem. Moreover, we propose information feedback from syntactic analysis to lexical analysis, since natural language processing is by nature an interaction of multiple knowledge sources. Experiments show that the CRF approach achieves a 0.70% F-score improvement in POS tagging and a 0.67% improvement in shallow parsing. We also confirm the effectiveness of information feedback for some complicated multi-class words.
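The contrast with MEMM comes down to normalization: an MEMM normalizes scores locally at each position, whereas a linear-chain CRF normalizes once over all label sequences, which is why it avoids label bias. The following is a minimal sketch of that global normalization, not the paper's implementation. The function names, the toy emission and transition scores, and the two-label setup are all hypothetical, chosen only for illustration.

```python
import math
from itertools import product


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def sequence_score(emissions, transitions, labels):
    """Unnormalized score of one label sequence.

    emissions[t][j]   : score of label j at position t (hypothetical feature scores)
    transitions[i][j] : score of moving from label i to label j
    """
    score = emissions[0][labels[0]]
    for t in range(1, len(labels)):
        score += transitions[labels[t - 1]][labels[t]] + emissions[t][labels[t]]
    return score


def log_partition(emissions, transitions):
    """log Z(x): forward algorithm in log space, summing over ALL label sequences."""
    n_labels = len(transitions)
    alpha = list(emissions[0])
    for t in range(1, len(emissions)):
        alpha = [
            emissions[t][j]
            + log_sum_exp([alpha[i] + transitions[i][j] for i in range(n_labels)])
            for j in range(n_labels)
        ]
    return log_sum_exp(alpha)


def sequence_log_prob(emissions, transitions, labels):
    """log p(y | x) for a linear-chain CRF: sequence score minus the global log Z(x)."""
    return sequence_score(emissions, transitions, labels) - log_partition(
        emissions, transitions
    )


# Toy example: 3 tokens, 2 labels (all numbers are made up).
emissions = [[1.0, 0.2], [0.3, 0.9], [0.5, 0.4]]
transitions = [[0.1, 0.8], [0.6, 0.2]]

# Because normalization is global, probabilities of all 2**3 sequences sum to 1.
total = sum(
    math.exp(sequence_log_prob(emissions, transitions, y))
    for y in product(range(2), repeat=3)
)
```

In a real tagger the emission and transition scores would be weighted sums of feature functions learned from annotated data; here they are fixed numbers so that the normalization step is easy to inspect.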