Abstract
Relation extraction from textual documents is an important task in information extraction. It aims to identify relations between pairs of named entities and to assign each relation a type. Relation extraction is often approached as a supervised classification problem, involving pre-processing steps such as text segmentation, entity recognition, and morphological and syntactic annotation. Previous studies pre-process the data in different ways, which makes the comparison of classification techniques for relation extraction unfair and inconclusive. Some of these techniques rely on kernels, which enable the comparison of complex structures. We propose a benchmark for comparing different kernels for relation extraction. Specifically, we propose applying a common pre-processing stage, together with an online learning algorithm that trains Support Vector Machines with kernels designed for classifying candidate pairs of related entities. We also report the results of a systematic experimental validation, performed on well-known datasets in the area.
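To make the idea of training a kernel SVM online more concrete, the following is a minimal sketch of the kernelized variant of Pegasos (Shalev-Shwartz, Singer and Srebro, ICML 2007). The abstract names neither the algorithm nor the kernels, so the choice of Pegasos here is an assumption, and the token-overlap kernel is only a toy stand-in for a relation extraction kernel. All function names, parameters, and example sentences are hypothetical.

```python
import random
from collections import Counter


def bag_of_words_kernel(a, b):
    """Toy kernel (assumption): dot product of token-count vectors.

    A stand-in for the structured kernels used in relation extraction.
    """
    counts_a, counts_b = Counter(a), Counter(b)
    return sum(counts_a[tok] * counts_b[tok] for tok in counts_a)


def train_kernel_pegasos(examples, labels, kernel, lam=0.1, iterations=500, seed=0):
    """Kernelized Pegasos: returns one mistake counter (alpha) per training example."""
    rng = random.Random(seed)
    n = len(examples)
    # Precompute the Gram matrix so each step only needs table lookups.
    gram = [[kernel(examples[i], examples[j]) for j in range(n)] for i in range(n)]
    alpha = [0] * n
    for t in range(1, iterations + 1):
        i = rng.randrange(n)  # pick one training example uniformly at random
        score = sum(alpha[j] * labels[j] * gram[i][j] for j in range(n)) / (lam * t)
        if labels[i] * score < 1:  # margin violated: count a sub-gradient step
            alpha[i] += 1
    return alpha


def decision_value(x, examples, labels, alpha, kernel, lam=0.1, iterations=500):
    """Signed score for a new candidate; positive means 'relation holds'."""
    total = sum(a * y * kernel(s, x) for a, y, s in zip(alpha, labels, examples) if a)
    return total / (lam * iterations)


# Hypothetical candidates: the tokens found between two entity mentions.
train_x = [
    ["was", "born", "in"],
    ["born", "in", "the", "city", "of"],
    ["works", "for"],
    ["is", "employed", "by"],
]
train_y = [1, 1, -1, -1]  # +1: "born-in" relation, -1: any other relation

alpha = train_kernel_pegasos(train_x, train_y, bag_of_words_kernel)
print(decision_value(["born", "in"], train_x, train_y, alpha, bag_of_words_kernel))
print(decision_value(["works", "for"], train_x, train_y, alpha, bag_of_words_kernel))
```

Because the learner touches the data only through `kernel`, swapping in a different kernel leaves the training loop unchanged, which is the property a common benchmark for kernels can exploit.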
Acknowledgements
We would like to thank Gonçalo Simões for the fruitful discussions, and for advice on preliminary versions of this paper.
This work was supported by Fundação para a Ciência e a Tecnologia, under Project UID/CEC/50021/2013, and under Project DataStorm (ref. EXCL/EEI-ESS/0257/2012).
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Pereira, J.L.M., Galhardas, H., Martins, B. (2015). A Benchmark for Relation Extraction Kernels. In: Tadeusz, M., Valduriez, P., Bellatreche, L. (eds) Advances in Databases and Information Systems. ADBIS 2015. Lecture Notes in Computer Science, vol 9282. Springer, Cham. https://doi.org/10.1007/978-3-319-23135-8_13
DOI: https://doi.org/10.1007/978-3-319-23135-8_13
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-23134-1
Online ISBN: 978-3-319-23135-8
eBook Packages: Computer Science, Computer Science (R0)