Abstract
In this paper, we demonstrate the syntactic analysis and visualization of scientific data, namely references from scientific papers. Our main goal is to build a parser that extracts references from scientific papers, converts them to XML, sends them to a custom visualization algorithm, and presents the result in a web interface as a ReferenceTree for a single author. For this process, we use several technologies: the NLP software NooJ and the programming languages PHP and JavaScript, in combination with HTML5. Our main problem was the dissimilarity of reference styles between articles. Thus, our parser was designed to recognize different reference sources (book, paper, web page) in the APA, MLA and Chicago reference styles. For the visualization, we chose to present an author as a tree, with the publication years as the main branches, the articles and books as twigs, and the references used in each article or book as the leaves. The books are grouped on the left side of the tree, while the articles are grouped on the right side. In the final output, every processed author should have a unique tree (reflecting that author's preferences in references) that can be compared with the rest of the scientific forest.
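The tree layout described in the abstract can be illustrated with a minimal JavaScript sketch. It is not the authors' implementation: the function name `buildReferenceTree`, the record fields (`year`, `kind`, `title`, `references`) and the sample data are all hypothetical, and the sketch assumes the references have already been parsed into plain objects.

```javascript
// Minimal sketch (hypothetical names and data, not the paper's code).
// Assumed input record: { year, kind: "book" | "article", title, references: [...] }
function buildReferenceTree(author, publications) {
  const branches = new Map(); // publication year -> main branch

  for (const pub of publications) {
    if (!branches.has(pub.year)) {
      branches.set(pub.year, { year: pub.year, left: [], right: [] });
    }
    const branch = branches.get(pub.year);

    // Each article or book is a twig; its references are the leaves.
    const twig = { title: pub.title, kind: pub.kind, leaves: [...pub.references] };

    // Books are grouped on the left side, articles on the right side.
    if (pub.kind === "book") {
      branch.left.push(twig);
    } else {
      branch.right.push(twig);
    }
  }

  // Order the main branches chronologically.
  const sorted = [...branches.values()].sort((a, b) => a.year - b.year);
  return { author, branches: sorted };
}

// Placeholder data for illustration only.
const tree = buildReferenceTree("Example Author", [
  { year: 2015, kind: "article", title: "Article B", references: ["Ref 3", "Ref 4"] },
  { year: 2014, kind: "book", title: "Book A", references: ["Ref 1"] },
  { year: 2014, kind: "article", title: "Article A", references: ["Ref 1", "Ref 2"] },
]);
```

A renderer could then walk `tree.branches` in order, drawing the `left` twigs on one side of the trunk and the `right` twigs on the other; how the tree is actually drawn is outside the scope of this sketch.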
Notes
- 1. NooJ can be downloaded freely from http://www.nooj4nlp.net/.
- 2.
- 3. Due to the length and complexity of the actual code and pseudocode used, this paper gives only the main steps; the visual demo and JavaScript source code are available at: http://www.ikstudenstkiprojekti.ffzg.hr/CitationTrees/exampleTree.php.
© 2016 Springer International Publishing AG
About this paper
Cite this paper
Požega, M., Poljak, D., Kocijan, K. (2016). Building Scholarly Data Forest. In: González-Beltrán, A., Osborne, F., Peroni, S. (eds) Semantics, Analytics, Visualization. Enhancing Scholarly Data. SAVE-SD 2016. Lecture Notes in Computer Science, vol 9792. Springer, Cham. https://doi.org/10.1007/978-3-319-53637-8_8
DOI: https://doi.org/10.1007/978-3-319-53637-8_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-53636-1
Online ISBN: 978-3-319-53637-8
eBook Packages: Computer Science, Computer Science (R0)