Mining the Bibliome

Translational Informatics

Part of the book series: Health Informatics (HI)

Abstract

Biomedical literature offers a systematic catalogue of interpretations about data that can be used to infer new knowledge. The analysis of this literature (referred to in this chapter as the “bibliome”) in light of the exponential growth of biomedical data necessitates methodologies to transform data into knowledge. Such techniques are fraught with challenges, but offer some promise for transforming the big data deluge into novel hypotheses that can lead to new knowledge. This chapter provides an overview of the knowledge discovery process in the context of biomedical literature, and explains how such a process (referred to as “bibliome mining”) can be seen as an integral part of a learning healthcare system.

Additional Reading

  • Collen MF. Computer medical databases: the first six decades (1950–2010). London: Springer; 2012. xix, 288 p.

  • Danciu I, Cowan JD, Basford M, Wang X, Saip A, Osgood S, et al. Secondary use of clinical data: the Vanderbilt approach. J Biomed Inform. 2014. (in press) http://dx.doi.org/10.1016/j.jbi.2014.02.003

  • Friedman C. A broad-coverage natural language processing system. Proc AMIA Symp. 2000:270–4.

  • Grivell L. Mining the bibliome: searching for a needle in a haystack? New computing tools are needed to effectively scan the growing amount of scientific literature for useful information. EMBO Rep. 2002;3(3):200–3.

  • Howe D, Costanzo M, Fey P, Gojobori T, Hannick L, Hide W, et al. Big data: the future of biocuration. Nature. 2008;455(7209):47–50.

  • Prokosch HU, Ganslandt T. Perspectives for medical informatics. Reusing the electronic medical record for clinical research. Methods Inf Med. 2009;48(1):38–44.

  • Salton G, McGill MJ. Introduction to modern information retrieval. New York: McGraw-Hill; 1983. xv, 448 p.

  • Sarkar I. Methods in biomedical informatics: a pragmatic approach. Boston: Academic; 2013.

Author information

Corresponding author

Correspondence to Indra Neil Sarkar PhD, MLIS.

Copyright information

© 2015 Springer-Verlag London

About this chapter

Cite this chapter

Sarkar, I.N. (2015). Mining the Bibliome. In: Payne, P., Embi, P. (eds) Translational Informatics. Health Informatics. Springer, London. https://doi.org/10.1007/978-1-4471-4646-9_5

  • DOI: https://doi.org/10.1007/978-1-4471-4646-9_5

  • Publisher Name: Springer, London

  • Print ISBN: 978-1-4471-4645-2

  • Online ISBN: 978-1-4471-4646-9

  • eBook Packages: Medicine, Medicine (R0)
