Information Extraction for Biomedical Literature Using Artificial Intelligence: A Comparative Study

  • Conference paper
International Conference on Advanced Intelligent Systems for Sustainable Development (AI2SD’2023) (AI2SD 2023)

Abstract

The rapid growth of biomedical literature calls for efficient and effective knowledge extraction methods, and automated information extraction (IE) techniques offer a promising solution. This article presents a comparative study of recent research publications on extracting information such as drug interactions and diseases from biomedical literature. It analyses the selected studies, compares their findings, and discusses the attributes of the IE techniques they use. The methods examined span deep learning, machine learning, rule-based, and hybrid approaches, and their performance is compared using F1-score, precision, recall, and accuracy. The evidence suggests that deep learning methods achieved significant improvements in accuracy, while hybrid approaches demonstrated flexibility and robustness. The study also emphasizes domain-specific models and pre-trained language models as ways to enhance contextual understanding. Despite this progress, challenges persist, including the handling of complex sentences, data availability, and generalization. As the biomedical field evolves rapidly, IE methods play an increasingly critical role in the development of medical knowledge and patient care. This study highlights the need for continued research to develop and refine IE techniques for broader application.
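As an illustration of the four metrics named in the abstract, the following minimal sketch computes precision, recall, F1-score, and accuracy for a binary extraction task, such as deciding whether a candidate drug pair interacts. The function names and the toy labels are assumptions made for this example and are not taken from the paper or from any of the studies it compares.

```python
from typing import Dict, Sequence, Tuple


def confusion_counts(gold: Sequence[int], pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn) for binary labels, where 1 marks a positive."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    tp = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, pred) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 0)
    tn = sum(1 for g, p in zip(gold, pred) if g == 0 and p == 0)
    return tp, fp, fn, tn


def evaluate(gold: Sequence[int], pred: Sequence[int]) -> Dict[str, float]:
    """Compute precision, recall, F1-score, and accuracy; undefined ratios become 0.0."""
    tp, fp, fn, tn = confusion_counts(gold, pred)
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    accuracy = (tp + tn) / total if total else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}


# Hypothetical labels for eight candidate pairs (1 = interaction, 0 = none).
gold_labels = [1, 1, 1, 0, 0, 1, 0, 0]
pred_labels = [1, 0, 1, 0, 1, 1, 0, 1]
scores = evaluate(gold_labels, pred_labels)
print(scores)
```

Accuracy counts true negatives, so it can look high on corpora where most candidate pairs are negative. This is one reason extraction studies often report F1-score alongside it.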

Abbreviations

The following abbreviations are used in this paper:

  • IE: Information Extraction
  • NLP: Natural Language Processing
  • ML: Machine Learning
  • DL: Deep Learning
  • SVM: Support Vector Machine
  • CNN: Convolutional Neural Network
  • McDepCNN: Multichannel Dependency-based Convolutional Neural Network
  • LSTM: Long Short-Term Memory
  • BiLSTM: Bidirectional Long Short-Term Memory
  • DS-LSTM: Deep-contextualized Stacked Bi-LSTM
  • RNN: Recursive Neural Networks
  • BERT: Bidirectional Encoder Representations from Transformers
  • BioBERT: Biomedical BERT
  • R-BERT: Relationship-BERT
  • ADVBERT: Adversarial BERT
  • CRF: Conditional Random Field
  • NER: Named Entity Recognition
  • TF-IDF: Term Frequency-Inverse Document Frequency
  • SRL: Semantic Role Labelling
  • SDP: Shortest Dependency Path
  • GRGT: Grammatical Relationship Graph for Triplets
  • DSTK: Distributed Smoothed Tree Kernel
  • ROUGE: Recall-Oriented Understudy for Gisting Evaluation
  • MaxEnt: Maximum Entropy
  • DDI: Drug-Drug Interaction
  • PPI: Protein-Protein Interaction
  • PLMs: Pre-trained Language Models
  • MeSH: Medical Subject Headings
  • CORD-19: COVID-19 Open Research Dataset
  • IoT: Internet of Things
  • EMFs: Electromagnetic Fields
  • EESE: EMF Exposure Source Extraction
  • DTI: Drug Target Interaction
  • CPI: Compound-Protein Interaction
  • EMF-Portal: EMF-Scientific Literature Portal
  • SpiNet: Domain-specific FrameNet
  • PICO: Population, Intervention, Comparator, Outcome
  • RE: Regular Expressions

Corresponding author

Correspondence to Bouchaib Benkassioui.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Benkassioui, B., Retal, S., Kharmoum, N., Hadi, M.Y., Rhalem, W. (2024). Information Extraction for Biomedical Literature Using Artificial Intelligence: A Comparative Study. In: Ezziyyani, M., Kacprzyk, J., Balas, V.E. (eds) International Conference on Advanced Intelligent Systems for Sustainable Development (AI2SD’2023). AI2SD 2023. Lecture Notes in Networks and Systems, vol 904. Springer, Cham. https://doi.org/10.1007/978-3-031-52388-5_6
