Ontology-Driven Information Extraction from Research Publications

Pertsas, Vayianos; Constantopoulos, Panos

doi:10.1007/978-3-030-00066-0_21

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11057)

Included in the following conference series:

International Conference on Theory and Practice of Digital Libraries

Abstract

Extracting information from a research article, associating it with other sources and inferring new knowledge is a challenging task that has not yet been entirely addressed. We present Research Spotlight, a system that leverages existing information from DBpedia, retrieves articles from repositories, extracts and interrelates various kinds of named and non-named entities by exploiting article metadata, the structure of text as well as syntactic, lexical and semantic constraints, and populates a knowledge base in the form of RDF triples. An ontology designed to represent scholarly practices drives the whole process. The system is evaluated through two experiments that measure the overall accuracy in terms of token-based and entity-based precision, recall and F1 scores, as well as entity boundary detection, with promising results.
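
The difference between the token-based and entity-based scores mentioned above can be illustrated with a minimal sketch. It is not the authors' evaluation code: the span representation `(start, end, label)`, the function names and the labels `Method` and `Activity` are assumptions made only for illustration. Entity-based scoring counts a prediction as correct only when its boundaries and label match a gold entity exactly, while token-based scoring gives partial credit for each correctly labelled token.

```python
from typing import NamedTuple, Set, Tuple

# Hypothetical span format: (start_token, end_token_exclusive, label).
Span = Tuple[int, int, str]


class Scores(NamedTuple):
    precision: float
    recall: float
    f1: float


def prf(true_positives: int, n_predicted: int, n_gold: int) -> Scores:
    """Precision, recall and F1 from raw counts (0.0 when undefined)."""
    p = true_positives / n_predicted if n_predicted else 0.0
    r = true_positives / n_gold if n_gold else 0.0
    f1 = 2 * p * r / (p + r) if (p + r) else 0.0
    return Scores(p, r, f1)


def entity_scores(gold: Set[Span], pred: Set[Span]) -> Scores:
    """Entity-based: a predicted span counts only on an exact match."""
    return prf(len(gold & pred), len(pred), len(gold))


def to_tokens(spans: Set[Span]) -> Set[Tuple[int, str]]:
    """Expand each span into (token_index, label) pairs."""
    return {(i, label) for start, end, label in spans for i in range(start, end)}


def token_scores(gold: Set[Span], pred: Set[Span]) -> Scores:
    """Token-based: every correctly labelled token counts separately."""
    gold_tokens, pred_tokens = to_tokens(gold), to_tokens(pred)
    return prf(len(gold_tokens & pred_tokens), len(pred_tokens), len(gold_tokens))


# Illustrative data: the first prediction misses the last token of the gold span.
gold = {(0, 3, "Method"), (5, 6, "Activity")}
pred = {(0, 2, "Method"), (5, 6, "Activity")}

if __name__ == "__main__":
    print("entity-based:", entity_scores(gold, pred))
    print("token-based: ", token_scores(gold, pred))
```

In this toy case the truncated `Method` span counts as a complete miss under entity-based scoring but still earns credit for two of its three tokens under token-based scoring. Reporting both scores shows how much of the error comes from boundary detection.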

Author information

Authors and Affiliations

Department of Informatics, Athens University of Economics and Business, Athens, Greece
Vayianos Pertsas & Panos Constantopoulos
Digital Curation Unit, Athena Research Centre, Athens, Greece
Panos Constantopoulos

Corresponding author

Correspondence to Vayianos Pertsas.

Editor information

Editors and Affiliations

University Carlos III, Madrid, Spain
Eva Méndez
USI, Università della Svizzera Italiana, Lugano, Switzerland
Fabio Crestani
INESC TEC, Faculty of Engineering, University of Porto, Porto, Portugal
Cristina Ribeiro
INESC TEC, Faculty of Engineering, University of Porto, Porto, Portugal
Gabriel David
INESC TEC, Faculty of Engineering, University of Porto, Porto, Portugal
João Correia Lopes

About this paper

Cite this paper

Pertsas, V., Constantopoulos, P. (2018). Ontology-Driven Information Extraction from Research Publications. In: Méndez, E., Crestani, F., Ribeiro, C., David, G., Lopes, J. (eds) Digital Libraries for Open Knowledge. TPDL 2018. Lecture Notes in Computer Science, vol 11057. Springer, Cham. https://doi.org/10.1007/978-3-030-00066-0_21

DOI: https://doi.org/10.1007/978-3-030-00066-0_21
Published: 05 September 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-00065-3
Online ISBN: 978-3-030-00066-0
eBook Packages: Computer Science, Computer Science (R0)