A Two-View CoTraining Rule Induction System for Information Extraction
Information extraction is becoming an increasingly important task because of the rapid growth of online text. Pattern rule induction is one of the main approaches to information extraction, but constructing pattern rules manually is tedious and error-prone. In this paper, we present GRID_CoTrain, a weakly supervised approach that bootstraps GRID (a supervised rule induction system) with co-training and active learning. We also use external knowledge resources, such as WordNet and existing ontologies, to optimize the learned pattern rules.
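To make the bootstrapping idea concrete, the following is a minimal sketch of a generic two-view co-training loop with one active-learning query per round. It is not the GRID_CoTrain implementation. The view names ("context" and "content"), the feature-voting learner that stands in for a rule inducer, the confidence threshold, and the oracle function are all assumptions introduced for illustration.

```python
from collections import Counter, defaultdict

VIEWS = ("context", "content")  # hypothetical names for the two views


def train(examples, view):
    """Count labels per feature in one view (a stand-in for rule induction)."""
    counts = defaultdict(Counter)
    for ex, label in examples:
        for feat in ex[view]:
            counts[feat][label] += 1
    return counts


def predict(model, ex, view):
    """Return (label, confidence) from the label votes of matching features."""
    votes = Counter()
    for feat in ex[view]:
        votes.update(model.get(feat, {}))
    if not votes:
        return None, 0.0
    label, n = votes.most_common(1)[0]
    return label, n / sum(votes.values())


def co_train(labeled, unlabeled, oracle, rounds=5, threshold=0.9):
    """Grow the labeled pool using both views plus one oracle query per round."""
    labeled, unlabeled = list(labeled), list(unlabeled)
    for _ in range(rounds):
        if not unlabeled:
            break
        models = {v: train(labeled, v) for v in VIEWS}
        scored = []
        for ex in unlabeled:
            # Keep the prediction of whichever view is more confident.
            label, conf = max(
                (predict(models[v], ex, v) for v in VIEWS), key=lambda p: p[1]
            )
            scored.append((conf, label, ex))
        scored.sort(key=lambda t: t[0])
        # Active learning: ask the oracle about the least confident example.
        _, _, hardest = scored[0]
        labeled.append((hardest, oracle(hardest)))
        # Co-training: accept confident predictions as new training data.
        remaining = []
        for conf, label, ex in scored[1:]:
            if conf >= threshold:
                labeled.append((ex, label))
            else:
                remaining.append(ex)
        unlabeled = remaining
    return labeled, {v: train(labeled, v) for v in VIEWS}
```

In this sketch, an example that only one view can classify confidently still enters the labeled pool, so the other view gains training evidence for features it had never seen with a label. That is the basic mechanism co-training relies on.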