Abstract
We present an approach to text mining in areas where the entities of interest cannot be defined in advance. Our system is aimed at finding related events in natural science literature, in particular changing, increasing or decreasing variables in marine science publications. It enables semantic search for events by abstracting from morphological, lexical-semantic and syntactic variation. In addition, generalisation of variables through syntactic pruning helps to find similar variables. Relations between events are induced from co-occurrence frequencies. Extracted information is stored in a property graph database and accessed using the Cypher query language. A user interface presents events as a graph to visualise their type, frequency and relation strength, in combination with their textual sources.
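As an illustration of inducing relations from co-occurrence frequencies, the following is a minimal sketch rather than the system's actual implementation. It assumes that events are represented as (change type, variable) pairs, that co-occurrence is counted per sentence, and that the raw count serves as relation strength; the abstract specifies none of these details. The toy data and the name `cooccurrence_counts` are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy input: each inner list holds the events found in one
# sentence, written as (change type, variable) pairs.
sentences = [
    [("increase", "temperature"), ("decrease", "oxygen")],
    [("increase", "temperature"), ("decrease", "oxygen"), ("change", "pH")],
    [("increase", "temperature"), ("change", "pH")],
]


def cooccurrence_counts(sentences):
    """Count how often each unordered pair of distinct events shares a sentence."""
    counts = Counter()
    for events in sentences:
        # sorted(set(...)) drops repeated events and gives every pair a
        # canonical order, so (a, b) and (b, a) are counted as one key.
        for a, b in combinations(sorted(set(events)), 2):
            counts[(a, b)] += 1
    return counts


# Assumption: the raw co-occurrence count is used as the relation strength.
relation_strength = cooccurrence_counts(sentences)
for (a, b), n in relation_strength.most_common():
    print(a, "<->", b, ":", n)
```

In a graph-based store, each counted pair could become an edge between two event nodes, with the count as an edge property that a query can filter or rank on.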
Notes
- 3. Graph rendering with the vis.js JavaScript library: http://visjs.org/.
Acknowledgements
Financial aid from the European Commission (OCEAN-CERTAIN, FP7-ENV-2013-6.1-1; no: 603773) is gratefully acknowledged. We thank Murat Van Ardelan for sharing his knowledge of marine science.
Copyright information
© 2016 Springer International Publishing AG
About this paper
Cite this paper
Marsi, E., Øzturk, P. (2016). Text Mining of Related Events from Natural Science Literature. In: González-Beltrán, A., Osborne, F., Peroni, S. (eds) Semantics, Analytics, Visualization. Enhancing Scholarly Data. SAVE-SD 2016. Lecture Notes in Computer Science, vol 9792. Springer, Cham. https://doi.org/10.1007/978-3-319-53637-8_4
DOI: https://doi.org/10.1007/978-3-319-53637-8_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-53636-1
Online ISBN: 978-3-319-53637-8
eBook Packages: Computer Science, Computer Science (R0)