Ontology-Driven Information Extraction from Research Publications

  • Conference paper
Digital Libraries for Open Knowledge (TPDL 2018)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11057)

Abstract

Extraction of information from a research article, association with other sources and inference of new knowledge is a challenging task that has not yet been entirely addressed. We present Research Spotlight, a system that leverages existing information from DBpedia, retrieves articles from repositories, extracts and interrelates various kinds of named and non-named entities by exploiting article metadata, the structure of text as well as syntactic, lexical and semantic constraints, and populates a knowledge base in the form of RDF triples. An ontology designed to represent scholarly practices drives the whole process. The system is evaluated through two experiments that measure overall accuracy in terms of token- and entity-based precision, recall and F1 scores, as well as entity boundary detection, with promising results.
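
The difference between token-based and entity-based scoring mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's evaluation code: the span representation, the function names and the labels "Method" and "Activity" are assumptions made for illustration only. Entity-based scoring counts a prediction as correct only when its boundaries and label match a gold entity exactly, while token-based scoring gives credit for each correctly labelled token.

```python
from typing import Set, Tuple

# Hypothetical span format: (start_token, end_token_exclusive, label)
Span = Tuple[int, int, str]


def prf(tp: int, n_pred: int, n_gold: int) -> Tuple[float, float, float]:
    """Precision, recall and F1 from true-positive, predicted and gold counts."""
    precision = tp / n_pred if n_pred else 0.0
    recall = tp / n_gold if n_gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def entity_scores(gold: Set[Span], pred: Set[Span]) -> Tuple[float, float, float]:
    """Exact match: a predicted span counts only if boundaries and label agree."""
    return prf(len(gold & pred), len(pred), len(gold))


def token_scores(gold: Set[Span], pred: Set[Span]) -> Tuple[float, float, float]:
    """Partial credit: compare (token index, label) pairs covered by the spans."""
    def tokens(spans: Set[Span]) -> Set[Tuple[int, str]]:
        return {(i, label) for start, end, label in spans for i in range(start, end)}

    g, p = tokens(gold), tokens(pred)
    return prf(len(g & p), len(p), len(g))


# Illustrative data: the first predicted entity misses one token of the gold span.
gold = {(0, 3, "Method"), (5, 6, "Activity")}
pred = {(0, 2, "Method"), (5, 6, "Activity")}

entity_p, entity_r, entity_f1 = entity_scores(gold, pred)
token_p, token_r, token_f1 = token_scores(gold, pred)
print(f"entity-based: P={entity_p:.2f} R={entity_r:.2f} F1={entity_f1:.2f}")
print(f"token-based:  P={token_p:.2f} R={token_r:.2f} F1={token_f1:.2f}")
```

In this toy case the boundary error costs the entity-based score a full entity, while the token-based score loses only the single missed token, which is why reporting both gives a view of boundary detection quality.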

Notes

  1. https://orcid.org/
  2. https://www.crummy.com/software/BeautifulSoup/
  3. https://spacy.io/
  4. https://nlp.stanford.edu/software/CRF-NER.html

Author information

Corresponding author

Correspondence to Vayianos Pertsas.

Copyright information

© 2018 Springer Nature Switzerland AG

About this paper

Cite this paper

Pertsas, V., Constantopoulos, P. (2018). Ontology-Driven Information Extraction from Research Publications. In: Méndez, E., Crestani, F., Ribeiro, C., David, G., Lopes, J. (eds) Digital Libraries for Open Knowledge. TPDL 2018. Lecture Notes in Computer Science, vol 11057. Springer, Cham. https://doi.org/10.1007/978-3-030-00066-0_21

  • DOI: https://doi.org/10.1007/978-3-030-00066-0_21

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-00065-3

  • Online ISBN: 978-3-030-00066-0

  • eBook Packages: Computer Science, Computer Science (R0)
