Abstract
In this paper, we propose a cost-based re-ranking method to promote ranking diversity in biomedical information retrieval. The method aims to find passages that cover many different aspects of a query topic. First, the aspects covered by retrieved passages are detected and explicitly represented by Wikipedia concepts. Then, an aspect filter based on a two-stage model ranks the detected aspects in decreasing order of the probability that each aspect is generated by the query. Finally, the retrieved passages are re-ranked with the proposed cost-based method, which scores a passage according to the number of new aspects it covers and the query relevance of the aspects it covers. A series of experiments on the TREC 2006 and 2007 Genomics collections demonstrates the effectiveness of the proposed method in promoting ranking diversity for biomedical information retrieval.
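The re-ranking step described in the abstract can be illustrated with a minimal greedy sketch. The abstract does not give the paper's cost function, so everything below is an assumption made for illustration: the function name `rerank_by_aspect_gain`, the additive score (count of new aspects plus summed aspect relevance), the `novelty_weight` parameter, and the toy aspect-relevance values standing in for the output of the aspect filter.

```python
from typing import Dict, List, Set


def rerank_by_aspect_gain(
    passages: Dict[str, Set[str]],
    aspect_relevance: Dict[str, float],
    novelty_weight: float = 1.0,
) -> List[str]:
    """Greedily order passages by new-aspect coverage plus aspect relevance.

    passages: passage id -> set of aspects (e.g. Wikipedia concepts) it covers.
    aspect_relevance: aspect -> query-relevance score (hypothetical values).
    novelty_weight: hypothetical weight on the count of not-yet-covered aspects.
    """
    covered: Set[str] = set()      # aspects already covered by ranked passages
    remaining = dict(passages)     # passages not yet placed in the ranking
    ranking: List[str] = []

    while remaining:
        def score(pid: str) -> float:
            aspects = remaining[pid]
            new_aspects = aspects - covered
            relevance = sum(aspect_relevance.get(a, 0.0) for a in aspects)
            # Assumed additive combination; the paper's actual cost may differ.
            return novelty_weight * len(new_aspects) + relevance

        # max() returns the first maximal item, so ties keep the input order.
        best = max(remaining, key=score)
        ranking.append(best)
        covered |= remaining.pop(best)

    return ranking


if __name__ == "__main__":
    toy_passages = {"p1": {"a", "b"}, "p2": {"a", "b"}, "p3": {"c"}}
    toy_relevance = {"a": 0.5, "b": 0.3, "c": 0.1}
    print(rerank_by_aspect_gain(toy_passages, toy_relevance))
```

In this toy input, `p2` repeats the aspects of `p1`, so once `p1` is ranked the passage `p3`, which covers a new aspect, moves ahead of `p2` even though its aspect is less query-relevant.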
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Yin, X., Huang, X., Li, Z. (2010). Promoting Ranking Diversity for Biomedical Information Retrieval Using Wikipedia. In: Gurrin, C., et al. Advances in Information Retrieval. ECIR 2010. Lecture Notes in Computer Science, vol 5993. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-12275-0_43
DOI: https://doi.org/10.1007/978-3-642-12275-0_43
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-12274-3
Online ISBN: 978-3-642-12275-0
eBook Packages: Computer Science (R0)