A Feature-Based Approach for Relation Extraction from Thai News Documents
Relation extraction among named entities is one of the most important tasks in information extraction. This paper presents a feature-based approach for extracting relations among named entities from Thai news documents. In this approach, shallow linguistic processing, including pattern-based named entity extraction, is performed to construct several sets of features. Four supervised learning schemes are applied in turn to investigate the performance of relation extraction with different feature sets. Focusing on four types of relations in crime-related news documents, the experimental results show that the proposed method achieves an accuracy of up to 95% on a data set of 1736 entity pairs. The effect of each feature set on relation extraction is also explored and discussed.
Keywords: Relation Extraction, Named Entity Extraction, Thai Language Processing, Supervised Learning, Local Features