Abstract
Relevance feedback methods generally suffer from topic drift caused by word ambiguity and the synonymous use of words. To alleviate this inherent problem, we propose a novel query phrase expansion approach that uses semantic annotations in Wikipedia pages to enrich queries with context-disambiguating phrases. Focusing on the patent domain, and in particular on patent search where patents are classified into a hierarchy of categories, we attempt to understand the roles that phrases and words play in query expansion when determining the relevance of documents, and we examine their contributions to alleviating the query drift problem. Our approach is compared against the Relevance Model, a state-of-the-art baseline, and outperforms it in terms of MAP at all levels of the classification hierarchy.
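For readers unfamiliar with the metric named above, MAP (mean average precision) can be sketched as follows. This is the standard textbook definition, not necessarily the exact evaluation setup used in the paper. The function names, the `runs` and `qrels` structures, and the document identifiers are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Set


def average_precision(ranked: List[str], relevant: Set[str]) -> float:
    """Average precision of one ranked list against a set of relevant IDs."""
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            hits += 1
            # Precision at this rank, counted only where a relevant item appears.
            precision_sum += hits / rank
    # Normalise by the total number of relevant items, retrieved or not.
    return precision_sum / len(relevant)


def mean_average_precision(
    runs: Dict[str, List[str]], qrels: Dict[str, Set[str]]
) -> float:
    """Mean of the per-query average precision over all queries in `runs`."""
    if not runs:
        return 0.0
    total = sum(
        average_precision(ranked, qrels.get(query_id, set()))
        for query_id, ranked in runs.items()
    )
    return total / len(runs)


# Hypothetical example: two queries with made-up rankings and judgments.
runs = {"q1": ["d1", "d2", "d3", "d4"], "q2": ["a", "b"]}
qrels = {"q1": {"d1", "d3"}, "q2": {"b"}}
map_score = mean_average_precision(runs, qrels)
```

In a class search setting, the ranked items would be patent classes rather than documents, and the score would be computed separately at each level of the hierarchy. That mapping is an assumption based on the abstract alone.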
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Al-Shboul, B., Myaeng, SH. (2011). Query Phrase Expansion Using Wikipedia in Patent Class Search. In: Salem, M.V.M., Shaalan, K., Oroumchian, F., Shakery, A., Khelalfa, H. (eds) Information Retrieval Technology. AIRS 2011. Lecture Notes in Computer Science, vol 7097. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-25631-8_11
DOI: https://doi.org/10.1007/978-3-642-25631-8_11
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-25630-1
Online ISBN: 978-3-642-25631-8
eBook Packages: Computer Science, Computer Science (R0)