Category-Based Query Modeling for Entity Search

Balog, Krisztian; Bron, Marc; de Rijke, Maarten

doi:10.1007/978-3-642-12275-0_29

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 5993)

Included in the following conference series:

European Conference on Information Retrieval

Abstract

Users often search for entities instead of documents and, in this setting, are willing to provide extra input in addition to a query, such as category information and example entities. We propose a general probabilistic framework for entity search to evaluate and provide insight into the many ways of using these types of input for query modeling. We focus on the use of category information and show the advantage of a category-based representation over a term-based representation. We also demonstrate the effectiveness of category-based expansion using example entities. Our best performing model shows very competitive performance on the INEX-XER entity ranking and list completion tasks.
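
As a rough illustration of what a category-based query representation with expansion from example entities could look like, the sketch below builds a distribution over categories from the categories supplied with the query and those attached to example entities, then ranks candidate entities against it. This is not the model from the paper: the function names, the mixing weight `mu`, the additive smoothing constant `alpha`, and the toy category labels are all assumptions made for the example.

```python
import math
from collections import Counter


def normalize(counts):
    """Turn raw counts into a probability distribution (empty if no counts)."""
    total = sum(counts.values())
    return {k: v / total for k, v in counts.items()} if total else {}


def category_query_model(query_cats, example_entity_cats, mu=0.5):
    """Mix the query's categories with categories of example entities.

    mu is a hypothetical interpolation weight for the expansion part.
    """
    base = normalize(Counter(query_cats))
    expansion = normalize(Counter(c for cats in example_entity_cats for c in cats))
    if not expansion:
        return base
    if not base:
        return expansion
    return {c: (1 - mu) * base.get(c, 0.0) + mu * expansion.get(c, 0.0)
            for c in set(base) | set(expansion)}


def category_score(query_model, entity_cats, num_categories, alpha=0.1):
    """Cross-entropy style score: sum over c of P(c|q) * log P(c|e).

    P(c|e) is estimated from the entity's categories with additive
    smoothing (alpha), an assumption chosen here for simplicity.
    """
    counts = Counter(entity_cats)
    denom = len(entity_cats) + alpha * num_categories
    return sum(p * math.log((counts[c] + alpha) / denom)
               for c, p in query_model.items())


def rank_entities(query_model, entities, num_categories):
    """Return (entity, score) pairs sorted from best to worst."""
    scored = [(name, category_score(query_model, cats, num_categories))
              for name, cats in entities.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

In a fuller system a score like this would typically be combined with a term-based score for the same entity; how the two components are weighted is exactly the kind of design choice the framework in the abstract is meant to examine.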

Author information

Authors and Affiliations

ISLA, University of Amsterdam, Science Park 107, 1098 XG Amsterdam, The Netherlands
Krisztian Balog, Marc Bron & Maarten de Rijke

Editor information

Editors and Affiliations

Adaptive Information Cluster, Dublin City University, Dublin 9, Ireland
Cathal Gurrin
The Open University, Walton Hall, MK7 6HF, Milton Keynes, UK
Yulan He
Microsoft Research Ltd, 7 JJ Thomson Avenue, CB3 0FB, Cambridge, UK
Gabriella Kazai
Department of Computer Science, University of Essex, Wivenhoe Park, CO4 3SQ, Colchester, UK
Udo Kruschwitz
The Open University, Walton Hall, Milton Keynes, UK
Suzanne Little
University of London, London, UK
Thomas Roelleke
Knowledge Media Institute, The Open University, MK7 6AA, Milton Keynes, UK
Stefan Rüger
Department of Computing Science, University of Glasgow, 17 Lilybank Gardens, G12 8QQ, Glasgow, UK
Keith van Rijsbergen

About this paper

Cite this paper

Balog, K., Bron, M., de Rijke, M. (2010). Category-Based Query Modeling for Entity Search. In: Gurrin, C., et al. Advances in Information Retrieval. ECIR 2010. Lecture Notes in Computer Science, vol 5993. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-12275-0_29

DOI: https://doi.org/10.1007/978-3-642-12275-0_29
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-12274-3
Online ISBN: 978-3-642-12275-0
eBook Packages: Computer Science, Computer Science (R0)
