Abstract
Named entity recognition (NER) is an important component of many information extraction and linking pipelines. The task is especially challenging in low-resource scenarios, where only a very limited amount of high-quality annotated data is available. In this paper we benchmark machine learning approaches for NER that may be very effective in such cases, and compare their performance in a novel application: extracting information about research infrastructure from scientific manuscripts. We explore approaches such as incorporating Contrastive Learning (CL) and Conditional Random Field (CRF) weights in BERT-based architectures, and demonstrate experimentally that such combinations are very efficient in few-shot learning setups, confirming similar findings reported in other areas of NLP as well as in Computer Vision. More specifically, we show that using CRF weights in BERT-based architectures improves overall NER performance by approximately 12%, and that in few-shot setups the benefit of CRF weights is considerably larger for smaller training sets.
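To illustrate why CRF transition weights can help a token classifier, the following is a minimal sketch and not the authors' implementation. It assumes a hypothetical BIO tag set with a single entity type `RI`, hand-set transition scores that penalise invalid tag sequences (in a trained CRF these weights are learned), and made-up per-token emission scores standing in for the logits of a BERT encoder. Viterbi decoding then picks the best-scoring valid sequence, whereas a per-token argmax may not.

```python
from typing import List, Sequence

# Hypothetical BIO tag set; "RI" (research infrastructure) is an assumed label name.
TAGS = ["O", "B-RI", "I-RI"]
NEG = -1e4  # large penalty standing in for a strongly negative learned weight


def bio_transitions(tags: Sequence[str]) -> List[List[float]]:
    """Transition scores trans[prev][cur]; penalise I-X unless it follows B-X or I-X."""
    trans = [[0.0] * len(tags) for _ in tags]
    for i, prev in enumerate(tags):
        for j, cur in enumerate(tags):
            if cur.startswith("I-") and prev not in ("B-" + cur[2:], cur):
                trans[i][j] = NEG
    return trans


def viterbi(emissions: Sequence[Sequence[float]],
            transitions: Sequence[Sequence[float]],
            start: Sequence[float]) -> List[int]:
    """Return the highest-scoring tag index sequence under emission + transition scores."""
    n_tags = len(transitions)
    score = [start[j] + emissions[0][j] for j in range(n_tags)]
    backptrs = []
    for emit in emissions[1:]:
        new_score, ptrs = [], []
        for j in range(n_tags):
            # Best previous tag for ending in tag j at this position.
            best_i = max(range(n_tags), key=lambda i: score[i] + transitions[i][j])
            new_score.append(score[best_i] + transitions[best_i][j] + emit[j])
            ptrs.append(best_i)
        score = new_score
        backptrs.append(ptrs)
    # Follow back-pointers from the best final tag.
    path = [max(range(n_tags), key=lambda j: score[j])]
    for ptrs in reversed(backptrs):
        path.append(ptrs[path[-1]])
    path.reverse()
    return path


# Made-up emission scores for three tokens (columns follow TAGS order).
emissions = [
    [2.0, 0.5, 0.0],
    [0.2, 1.0, 1.5],
    [0.1, 0.3, 2.0],
]
start = [0.0, 0.0, NEG]  # a sequence should not begin with I-RI

greedy = [TAGS[max(range(len(TAGS)), key=lambda j: row[j])] for row in emissions]
decoded = [TAGS[k] for k in viterbi(emissions, bio_transitions(TAGS), start)]
print("per-token argmax:", greedy)
print("Viterbi with transitions:", decoded)
```

In this toy input the per-token argmax yields an `I-RI` directly after `O`, which is not a valid BIO sequence, while the decoded path opens the entity with `B-RI`. With few training examples, such sequence-level structure is one plausible reason a CRF layer could matter more, though the sketch itself makes no claim about the reported 12% figure.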
Notes
1. DOI: 10.17632/ty73wxgtpx.1.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Cheirmpos, G., Tabatabaei, S.A., Kanoulas, E., Tsatsaronis, G. (2024). Benchmarking Named Entity Recognition Approaches for Extracting Research Infrastructure Information from Text. In: Nicosia, G., Ojha, V., La Malfa, E., La Malfa, G., Pardalos, P.M., Umeton, R. (eds) Machine Learning, Optimization, and Data Science. LOD 2023. Lecture Notes in Computer Science, vol 14505. Springer, Cham. https://doi.org/10.1007/978-3-031-53969-5_11
DOI: https://doi.org/10.1007/978-3-031-53969-5_11
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-53968-8
Online ISBN: 978-3-031-53969-5
eBook Packages: Computer Science, Computer Science (R0)