BioMedical Information Retrieval: The BioTracer Approach

  • Conference paper
Information Technology in Bio- and Medical Informatics, ITBAM 2010

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 6266)

Abstract

With the large amount of biomedical information available today, a good search tool is vital. Such a tool should not only retrieve the sought information but also filter out irrelevant documents while ranking the relevant ones highest. Focusing on biomedical information, the main goal of this work has been to investigate how to improve a system's ability to find and rank relevant documents. To achieve this, we apply a series of information retrieval (IR) techniques to search biomedical information and combine them in an optimal manner. These techniques include extending and using well-established IR similarity models such as the Vector Space Model (VSM) and BM25, together with their underlying scoring schemes, and allowing users to influence the ranking according to their own view of relevance. The techniques have been implemented and tested in a proof-of-concept prototype called BioTracer, which extends a Java-based open source search engine library. The results from our experiments using the TREC 2004 Genomics Track collection seem promising. Our investigation has also revealed that involving the user in the search has positive effects on the ranking of search results, and that the approaches used in BioTracer can be used to meet the user's information needs.
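For readers unfamiliar with BM25, the following is a minimal sketch of the standard Okapi BM25 scoring scheme, not the BioTracer implementation or its extensions. The function name `bm25_scores`, the parameter defaults `k1=1.2` and `b=0.75`, the non-negative IDF variant, and the toy documents are all assumptions chosen for illustration; none of them come from the paper.

```python
import math
from collections import Counter


def bm25_scores(query_terms, docs, k1=1.2, b=0.75):
    """Score each tokenized document in `docs` against `query_terms`.

    Illustrative Okapi BM25 only; k1 and b are conventional defaults,
    not values reported in the paper.
    """
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents that contain each term.
    df = Counter(term for d in docs for term in set(d))

    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            # Non-negative IDF variant (an assumption of this sketch).
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term-frequency saturation with document-length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


# Hypothetical toy collection of pre-tokenized documents.
docs = [
    ["brca1", "gene", "mutation"],
    ["protein", "folding"],
    ["brca1", "brca1", "cancer", "gene"],
]
scores = bm25_scores(["brca1"], docs)
ranking = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
```

In this toy collection the third document ranks first because it mentions the query term twice, while the second document, which lacks the term, scores zero. A user-controlled ranking of the kind the abstract describes could plausibly be layered on top by weighting terms or fields, but how BioTracer does this is not stated here.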

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Ramampiaro, H. (2010). BioMedical Information Retrieval: The BioTracer Approach. In: Khuri, S., Lhotská, L., Pisanti, N. (eds) Information Technology in Bio- and Medical Informatics, ITBAM 2010. ITBAM 2010. Lecture Notes in Computer Science, vol 6266. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15020-3_14

  • DOI: https://doi.org/10.1007/978-3-642-15020-3_14

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-15019-7

  • Online ISBN: 978-3-642-15020-3

  • eBook Packages: Computer Science, Computer Science (R0)
