Abstract
The growing biomedical literature demands efficient and effective methods for knowledge extraction, and automated information extraction (IE) techniques offer a promising solution. This article presents a comparative study of recent publications that extract information such as drug interactions and diseases from biomedical literature. It analyses the selected studies, compares their findings, and discusses the attributes of the IE techniques they use. The methods examined span deep learning, machine learning, rule-based methods, and hybrid approaches, and their performance is evaluated using F1-score, precision, recall, and accuracy. The evidence suggests that deep learning methods achieved significant improvements in accuracy, while hybrid approaches demonstrated flexibility and robustness. Domain-specific models and pre-trained language models are also emphasized as a way to enhance contextual understanding. Despite this progress, challenges persist, including the handling of complex sentences, data availability, and generalization. As the biomedical field evolves rapidly, IE methods play an increasingly critical role in the development of medical knowledge and patient care. This study highlights the need for continued research to develop and refine IE techniques for broader application.
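The evaluation metrics named in the abstract (precision, recall, F1-score, and accuracy) can be illustrated with a minimal sketch. It assumes a binary labelling task in which 1 marks a candidate pair that expresses an interaction; the function name `evaluate`, the `Scores` container, and the example labels are hypothetical and are not taken from the surveyed studies.

```python
from dataclasses import dataclass


@dataclass
class Scores:
    precision: float
    recall: float
    f1: float
    accuracy: float


def evaluate(gold: list[int], predicted: list[int]) -> Scores:
    """Compute binary precision, recall, F1, and accuracy (1 = positive class)."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")

    # Count the four cells of the confusion matrix.
    tp = sum(1 for g, p in zip(gold, predicted) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, predicted) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, predicted) if g == 1 and p == 0)
    tn = sum(1 for g, p in zip(gold, predicted) if g == 0 and p == 0)

    # Guard against empty denominators by falling back to 0.0.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    accuracy = (tp + tn) / len(gold) if gold else 0.0
    return Scores(precision, recall, f1, accuracy)


# Hypothetical labels for six candidate drug pairs (1 = interaction present).
gold = [1, 1, 1, 0, 0, 0]
predicted = [1, 1, 0, 1, 0, 0]
scores = evaluate(gold, predicted)
print(scores)
```

Relation extraction benchmarks are often class-imbalanced, which is one reason precision, recall, and F1 tend to be reported alongside accuracy rather than replaced by it.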
Abbreviations
In this paper, the following abbreviations are used:
- IE: Information Extraction
- NLP: Natural Language Processing
- ML: Machine Learning
- DL: Deep Learning
- SVM: Support Vector Machine
- CNN: Convolutional Neural Network
- McDepCNN: Multichannel Dependency-based Convolutional Neural Network
- LSTM: Long Short-Term Memory
- BiLSTM: Bidirectional Long Short-Term Memory
- DS-LSTM: Deep-contextualized Stacked Bi-LSTM
- RNN: Recursive Neural Networks
- BERT: Bidirectional Encoder Representations from Transformers
- BioBERT: Biomedical BERT
- R-BERT: Relationship-BERT
- ADVBERT: Adversarial BERT
- CRF: Conditional Random Field
- NER: Named Entity Recognition
- TF-IDF: Term Frequency-Inverse Document Frequency
- SRL: Semantic Role Labelling
- SDP: Shortest Dependency Path
- GRGT: Grammatical Relationship Graph for Triplets
- DSTK: Distributed Smoothed Tree Kernel
- ROUGE: Recall-Oriented Understudy for Gisting Evaluation
- MaxEnt: Maximum Entropy
- DDI: Drug-Drug Interaction
- PPI: Protein-Protein Interaction
- PLMs: Pre-trained Language Models
- MeSH: Medical Subject Headings
- CORD-19: COVID-19 Open Research Dataset
- IoT: Internet of Things
- EMFs: Electromagnetic Fields
- EESE: EMF Exposure Source Extraction
- DTI: Drug Target Interaction
- CPI: Compound-Protein Interaction
- EMF-Portal: EMF-Scientific Literature Portal
- SpiNet: Domain-specific FrameNet
- PICO: Population, Intervention, Comparator, Outcome
- RE: Regular Expressions
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Benkassioui, B., Retal, S., Kharmoum, N., Hadi, M.Y., Rhalem, W. (2024). Information Extraction for Biomedical Literature Using Artificial Intelligence: A Comparative Study. In: Ezziyyani, M., Kacprzyk, J., Balas, V.E. (eds) International Conference on Advanced Intelligent Systems for Sustainable Development (AI2SD’2023). AI2SD 2023. Lecture Notes in Networks and Systems, vol 904. Springer, Cham. https://doi.org/10.1007/978-3-031-52388-5_6
DOI: https://doi.org/10.1007/978-3-031-52388-5_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-52387-8
Online ISBN: 978-3-031-52388-5
eBook Packages: Intelligent Technologies and Robotics (R0)