Extracting and Querying Relations in Scientific Papers

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5243)

Abstract

High-precision linguistic and semantic analysis of scientific texts is an emerging research area. We describe methods and an application for extracting interesting factual relations from scientific texts in computational linguistics and language technology. We use a hybrid NLP architecture with shallow preprocessing for increased robustness and domain-specific, ontology-based named entity recognition, followed by a deep HPSG parser running the English Resource Grammar (ERG). The extracted relations in the MRS (minimal recursion semantics) format are simplified and generalized using WordNet. The resulting ‘quriples’ are stored in a database from which they can be retrieved by relation-based search. The query interface is embedded in a web browser-based application we call the Scientist’s Workbench. It supports researchers in editing and searching scientific papers online.
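
The last two steps of the abstract (generalizing extracted relations and retrieving them by relation-based search) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the table layout with `subject`, `predicate`, and `object` columns, the function names, and the small `SYNONYMS` dictionary, which only stands in for WordNet-based generalization. Predicates are assumed to arrive already lemmatized.

```python
import sqlite3

# Hypothetical stand-in for WordNet-based generalization: each predicate
# lemma is mapped to one canonical representative of its synonym group.
SYNONYMS = {
    "utilize": "use",
    "employ": "use",
    "apply": "use",
}


def generalize(predicate):
    """Return the canonical form of a predicate lemma (itself if unknown)."""
    lemma = predicate.lower()
    return SYNONYMS.get(lemma, lemma)


def create_store():
    """Create an in-memory relation store (illustrative schema only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE relation ("
        "doc_id TEXT, subject TEXT, predicate TEXT, object TEXT)"
    )
    return conn


def add_relation(conn, doc_id, subject, predicate, obj):
    """Store one extracted relation with a generalized predicate."""
    conn.execute(
        "INSERT INTO relation VALUES (?, ?, ?, ?)",
        (doc_id, subject.lower(), generalize(predicate), obj.lower()),
    )


def search(conn, subject=None, predicate=None, obj=None):
    """Relation-based search; an argument left as None acts as a wildcard."""
    clauses, params = [], []
    if subject is not None:
        clauses.append("subject = ?")
        params.append(subject.lower())
    if predicate is not None:
        # Generalize the query predicate the same way as the stored ones.
        clauses.append("predicate = ?")
        params.append(generalize(predicate))
    if obj is not None:
        clauses.append("object = ?")
        params.append(obj.lower())
    sql = "SELECT doc_id, subject, predicate, object FROM relation"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    sql += " ORDER BY doc_id"
    return conn.execute(sql, params).fetchall()


if __name__ == "__main__":
    store = create_store()
    add_relation(store, "paper-1", "parser", "employ", "grammar")
    add_relation(store, "paper-2", "system", "utilize", "WordNet")
    # A query for "apply" matches both rows because all three verbs
    # share the canonical predicate "use" in the toy synonym table.
    for row in search(store, predicate="apply"):
        print(row)
```

Generalizing both the stored and the queried predicate to the same canonical form is what lets a search succeed when the query uses a different verb than the paper did; a real system would derive these groups from a lexical resource rather than a hand-written dictionary.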

Editor information

Andreas R. Dengel Karsten Berns Thomas M. Breuel Frank Bomarius Thomas R. Roth-Berghofer

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Schäfer, U., Uszkoreit, H., Federmann, C., Marek, T., Zhang, Y. (2008). Extracting and Querying Relations in Scientific Papers. In: Dengel, A.R., Berns, K., Breuel, T.M., Bomarius, F., Roth-Berghofer, T.R. (eds) KI 2008: Advances in Artificial Intelligence. KI 2008. Lecture Notes in Computer Science, vol 5243. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-85845-4_16

  • DOI: https://doi.org/10.1007/978-3-540-85845-4_16

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-85844-7

  • Online ISBN: 978-3-540-85845-4

  • eBook Packages: Computer Science, Computer Science (R0)
