Question Processing and Clustering in INDOC: A Biomedical Question Answering System

  • Parikshit Sondhi
  • Purushottam Raj
  • V Vinod Kumar
  • Ankush Mittal
Open Access
Research Article


The exponential growth in the volume of publications in the biomedical domain has made it impossible for an individual to keep pace with the advances. Even though evidence-based medicine has gained wide acceptance, physicians are unable to access the relevant information in the required time, leaving most of their questions unanswered. This accentuates the need for fast and accurate biomedical question answering systems. In this paper we introduce INDOC, a biomedical question answering system based on novel ideas for indexing documents and extracting answers to the questions posed. INDOC displays the results in clusters to help the user arrive at the most relevant set of documents quickly. The system was evaluated against the standard OHSUMED test collection. Our system achieves high accuracy and minimizes user effort.





Copyright information

© Parikshit Sondhi et al. 2007

This article is published under license to BioMed Central Ltd. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Authors and Affiliations

  • Parikshit Sondhi (1)
  • Purushottam Raj (1)
  • V Vinod Kumar (1)
  • Ankush Mittal (1)
  1. Department of Electronics and Computer Engineering, Indian Institute of Technology Roorkee, Roorkee, India
