This volume is filled with a variety of innovative approaches to helping users answer questions. In much of the research, however, one part of the solution is missing, namely the user. This chapter describes evaluation of interactive question answering with a focus on two initiatives: the Text REtrieval Conference (TREC) Interactive Track and studies in the medical domain. As will be seen, the two overlap considerably both in the model underlying the research and in the methods used.
Keywords
- Medical Student
- Information Retrieval
- Relevance Feedback
- Mean Average Precision
- Information Retrieval System
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.
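Among the keywords above, mean average precision (MAP) is the standard summary measure of ranked-retrieval effectiveness used throughout the TREC evaluations discussed in this chapter. As a minimal illustration (not code from the chapter), MAP averages, over a set of queries, the per-query average of precision computed at the rank of each relevant document retrieved:

```python
def average_precision(ranked, relevant):
    """Average precision for one query: mean of precision@k taken at
    the rank k of each relevant document in the ranked result list."""
    hits, score = 0, 0.0
    for k, doc in enumerate(ranked, start=1):
        if doc in relevant:
            hits += 1
            score += hits / k  # precision at this relevant hit's rank
    return score / len(relevant) if relevant else 0.0

def mean_average_precision(runs):
    """runs: list of (ranked_list, relevant_set) pairs, one per query."""
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)
```

For example, a query whose relevant documents appear at ranks 1 and 3 of a three-document result list scores (1/1 + 2/3) / 2 ≈ 0.833. Note that this batch-style measure is exactly what the chapter's user studies put under scrutiny: a system's MAP need not predict how well real users answer questions with it.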
© 2008 Springer
Cite this chapter
Hersh, W. (2008). Evaluating Interactive Question Answering. In: Strzalkowski, T., Harabagiu, S.M. (eds) Advances in Open Domain Question Answering. Text, Speech and Language Technology, vol 32. Springer, Dordrecht. https://doi.org/10.1007/978-1-4020-4746-6_14
DOI: https://doi.org/10.1007/978-1-4020-4746-6_14
Publisher Name: Springer, Dordrecht
Print ISBN: 978-1-4020-4744-2
Online ISBN: 978-1-4020-4746-6