Abstract
In this paper, three possible inputs (reference strings, reference segments, and a combination of reference strings and segments) were tested to find the best-performing strategy for citation matching. Our evaluation on a manually curated gold standard showed that input data combining reference segments and reference strings leads to the best result. In addition, using the segmentation probabilities improves the result when only features based on reference segments are considered.
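To make the three input types concrete, the following is a minimal sketch of how features for one reference/record pair might be built from the raw reference string, from the reference segments, and from both together. It is not the authors' implementation: the abstract does not describe the feature set or the classifier, so the function names, the dictionary layout, the token-level Jaccard similarity, and the choice to multiply each segment similarity by its segmentation probability are all assumptions made for illustration.

```python
import re


def tokens(text):
    """Return the set of lowercase alphanumeric tokens in a string."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Token-level Jaccard similarity (an assumed stand-in similarity measure)."""
    ta, tb = tokens(a), tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def string_features(ref_string, record):
    """Features from the unsegmented reference string only."""
    record_text = " ".join(str(v) for v in record.values())
    return {"string_jaccard": jaccard(ref_string, record_text)}


def segment_features(segments, record, use_probabilities=True):
    """Features from reference segments only.

    `segments` maps a field name to (value, probability); the probability
    is assumed to come from the segmentation step. When enabled, each
    field similarity is scaled by that probability (hypothetical weighting).
    """
    feats = {}
    for field, (value, prob) in segments.items():
        sim = jaccard(value, str(record.get(field, "")))
        feats["seg_" + field] = sim * prob if use_probabilities else sim
    return feats


def combined_features(ref_string, segments, record):
    """Features from both the reference string and its segments."""
    feats = string_features(ref_string, record)
    feats.update(segment_features(segments, record))
    return feats


# Hypothetical example data, for illustration only.
ref_string = "Moed, H.F.: Citation Analysis in Research Evaluation. Springer (2006)"
segments = {
    "title": ("Citation Analysis in Research Evaluation", 0.9),
    "year": ("2006", 0.5),
}
record = {
    "title": "Citation Analysis in Research Evaluation",
    "year": "2006",
    "author": "Moed",
}
features = combined_features(ref_string, segments, record)
```

In a setup like this, the resulting feature dictionary would be passed to whatever classifier decides whether the reference and the record match. Scaling by the segmentation probability is one simple way to let uncertain segments contribute less.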
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Ghavimi, B., Otto, W., Mayr, P. (2019). An Evaluation of the Effect of Reference Strings and Segmentation on Citation Matching. In: Doucet, A., Isaac, A., Golub, K., Aalberg, T., Jatowt, A. (eds) Digital Libraries for Open Knowledge. TPDL 2019. Lecture Notes in Computer Science, vol 11799. Springer, Cham. https://doi.org/10.1007/978-3-030-30760-8_35
DOI: https://doi.org/10.1007/978-3-030-30760-8_35
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-30759-2
Online ISBN: 978-3-030-30760-8
eBook Packages: Computer Science, Computer Science (R0)