Extracting and Querying Relations in Scientific Papers

Schäfer, Ulrich; Uszkoreit, Hans; Federmann, Christian; Marek, Torsten; Zhang, Yajing

doi:10.1007/978-3-540-85845-4_16

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5243)

Abstract

High-precision linguistic and semantic analysis of scientific texts is an emerging research area. We describe methods and an application for extracting interesting factual relations from scientific texts in computational linguistics and language technology. We use a hybrid NLP architecture with shallow preprocessing for increased robustness and domain-specific, ontology-based named entity recognition, followed by a deep HPSG parser running the English Resource Grammar (ERG). The extracted relations in the MRS (minimal recursion semantics) format are simplified and generalized using WordNet. The resulting ‘quriples’ are stored in a database from where they can be retrieved by relation-based search. The query interface is embedded in a web browser-based application we call the Scientist’s Workbench. It supports researchers in editing and online-searching scientific papers.
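The storage and retrieval step described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the layout of a quriple, so the four fields used here (subject, predicate, object, rest), the table schema, the function names, and the sample data are all hypothetical. The small hand-written predicate map merely stands in for the WordNet-based generalization and is not WordNet itself.

```python
import sqlite3
from typing import Iterable, List, NamedTuple, Optional


class Quriple(NamedTuple):
    """Hypothetical quriple layout; the abstract does not define the fields."""
    subject: str
    predicate: str
    obj: str
    rest: str


# Toy stand-in for WordNet-based generalization (an assumption, not WordNet):
# several surface verbs are mapped to one representative predicate.
PREDICATE_CLASSES = {"enhance": "improve", "boost": "improve", "improve": "improve"}


def normalize_predicate(verb: str) -> str:
    """Lower-case a verb and map it to its representative predicate, if any."""
    key = verb.strip().lower()
    return PREDICATE_CLASSES.get(key, key)


def create_store() -> sqlite3.Connection:
    """Create an in-memory database with a single quriple table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE quriples (subject TEXT, predicate TEXT, obj TEXT, rest TEXT)"
    )
    return conn


def add_quriples(conn: sqlite3.Connection, quriples: Iterable[Quriple]) -> None:
    """Insert quriples; each one is a plain 4-tuple as far as sqlite3 is concerned."""
    conn.executemany("INSERT INTO quriples VALUES (?, ?, ?, ?)", list(quriples))
    conn.commit()


def search(
    conn: sqlite3.Connection,
    subject: Optional[str] = None,
    predicate: Optional[str] = None,
    obj: Optional[str] = None,
) -> List[Quriple]:
    """Relation-based search: every argument that is given must match exactly."""
    clauses, params = [], []
    for column, value in (("subject", subject), ("predicate", predicate), ("obj", obj)):
        if value is not None:
            clauses.append(f"{column} = ?")
            params.append(value)
    where = " WHERE " + " AND ".join(clauses) if clauses else ""
    rows = conn.execute(
        "SELECT subject, predicate, obj, rest FROM quriples" + where + " ORDER BY rowid",
        params,
    )
    return [Quriple(*row) for row in rows]


if __name__ == "__main__":
    store = create_store()
    # Invented sample relations, for illustration only.
    add_quriples(store, [
        Quriple("parser", normalize_predicate("enhance"), "accuracy", ""),
        Quriple("tagger", normalize_predicate("use"), "lexicon", ""),
    ])
    for hit in search(store, predicate=normalize_predicate("improve")):
        print(hit)
```

Normalizing predicates both when storing and when querying is what lets a search for one verb also retrieve relations expressed with a related verb; a real system would presumably derive those classes from WordNet rather than from a fixed dictionary.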

Author information

Authors and Affiliations

German Research Center for Artificial Intelligence (DFKI), Language Technology Lab, Campus D 3 1, Stuhlsatzenhausweg 3, D-66123, Saarbrücken, Germany
Ulrich Schäfer, Hans Uszkoreit, Christian Federmann, Torsten Marek & Yajing Zhang

Editor information

Andreas R. Dengel, Karsten Berns, Thomas M. Breuel, Frank Bomarius, Thomas R. Roth-Berghofer

Cite this paper

Schäfer, U., Uszkoreit, H., Federmann, C., Marek, T., Zhang, Y. (2008). Extracting and Querying Relations in Scientific Papers. In: Dengel, A.R., Berns, K., Breuel, T.M., Bomarius, F., Roth-Berghofer, T.R. (eds) KI 2008: Advances in Artificial Intelligence. KI 2008. Lecture Notes in Computer Science, vol 5243. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-85845-4_16

DOI: https://doi.org/10.1007/978-3-540-85845-4_16
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-85844-7
Online ISBN: 978-3-540-85845-4
eBook Packages: Computer Science, Computer Science (R0)
