Encyclopedia of Database Systems

2018 Edition
Editors: Ling Liu, M. Tamer Özsu

Biomedical Scientific Textual Data Types and Processing

  • Li Zhou
  • Hua Xu
Reference work entry
DOI: https://doi.org/10.1007/978-1-4614-8265-9_495


Annotation; Biomedical literature; Curation; Indexing; Information retrieval; Information retrieval models/metrics/operations; MEDLINE/PubMed; Scientific knowledge bases; Semi-structured text retrieval; Text extraction; Text mining; Web search and crawling


Vast amounts of biomedical scientific information and knowledge are recorded in text [1, 7]. Scientific textual data in the biomedical domain are generally disseminated through the following resources [7, 11]: biomedical literature (e.g., original reports and summaries of research in journals, books, reports, and guidelines), biological databases (e.g., annotations in gene/protein databases), patient records (e.g., clinical narrative reports), and web content.

A variety of techniques have been applied to identify, extract, manage, integrate, and exploit knowledge from biomedical text. Some researchers [11] divide biomedical scientific textual data processing into three major activities, as shown in Fig. 1...

Recommended Reading

  1. Chen H, Fuller SS, Friedman C, Hersh W, editors. Medical informatics: knowledge management and data mining in biomedicine. Secaucus: Springer; 2005.
  2. Cohen AM, Hersh WR. A survey of current work in biomedical text mining. Brief Bioinform. 2005;6(1):57–71.
  3. Donaldson I, Martin J, deBruijn B, Wolting C, Lay V, Tuekam B, Zhang S, Baskin B, Bader G, Michalickova K, et al. PreBIND and textomy – mining the biomedical literature for protein-protein interactions using a support vector machine. BMC Bioinf. 2003;4(1):11.
  4. Friedman C, Kra P, Yu H, Krauthammer M, Rzhetsky A. GENIES: a natural-language processing system for the extraction of molecular pathways from journal articles. Bioinformatics. 2001;17(Suppl 1):S74–82.
  5. Gaizauskas R, Demetriou G, Artymiuk PJ, Willett P. Protein structures and information extraction from biological texts: the PASTA system. Bioinformatics. 2003;19(1):135–43.
  6. Hearst M. Untangling text data mining. In: Proceedings of the 37th Annual Meeting of the Association for Computational Linguistics; 1999.
  7. Hersh W. Information retrieval: a health and biomedical perspective. New York: Springer; 2003.
  8. Hersh W, Cohen A, Roberts P, Rekapalli HK. TREC 2006 genomics track overview. In: Proceedings of the Text Retrieval Conference; 2006. Available at: http://trec.nist.gov/pubs/trec15/papers/GEO06.OVERVIEW.pdf
  9. Hoffmann R, Valencia A. A gene network for navigating the literature. Nat Genet. 2004;36(7):664.
  10. Hristovski D, Peterlin B. Literature-based disease candidate gene discovery. In: Proceedings of the Medinfo; 2004. p. 1649.
  11. Natarajan J, Berrar D, Hack CJ, Dubitzky W. Knowledge discovery in biology and biotechnology texts: a review of techniques, evaluation strategies, and applications. Crit Rev Biotechnol. 2005;25(1–2):31–52.
  12. Smalheiser N, Swanson D. Using ARROWSMITH: a computer-assisted approach to formulating and assessing scientific hypotheses. Comput Methods Prog Biomed. 1998;57(3):149–53.
  13. Swanson DR. Complementary structure in disjoint science literatures. In: Proceedings of the 23rd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval; 1990. p. 280–9.
  14. Yeh AS, Hirschman L, Morgan AA. Evaluation of text data mining for database curation: lessons learned from the KDD Challenge Cup. Bioinformatics. 2003;19(Suppl 1):i331–9.

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. Partners HealthCare System Inc., Boston, USA
  2. Columbia University, New York, USA

Section editors and affiliations

  • Vipul Kashyap
    • 1
  1. Director, Clinical Programs, CIGNA Healthcare, Bloomfield, USA