Skip to main content

Select, Link and Rank: Diversified Query Expansion and Entity Ranking Using Wikipedia

  • Conference paper
  • First Online:
Web Information Systems Engineering – WISE 2016 (WISE 2016)

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 10041))

Included in the following conference series:

Abstract

A search query, being a very concise grounding of user intent, could potentially have many possible interpretations. Search engines hedge their bets by diversifying top results to cover multiple such possibilities so that the user is likely to be satisfied, whatever be her intended interpretation. Diversified Query Expansion is the problem of diversifying query expansion suggestions, so that the user can specialize the query to better suit her intent, even before perusing search results. We propose a method, Select-Link-Rank, that exploits semantic information from Wikipedia to generate diversified query expansions. SLR does collective processing of terms and Wikipedia entities in an integrated framework, simultaneously diversifying query expansions and entity recommendations. SLR starts with selecting informative terms from search results of the initial query, links them to Wikipedia entities, performs a diversity-conscious entity scoring and transfers such scoring to the term space to arrive at query expansion suggestions. Through an extensive empirical analysis and user study, we show that our method outperforms the state-of-the-art diversified query expansion and diversified entity recommendation techniques.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 39.99
Price excludes VAT (USA)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 54.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Notes

  1. 1.

    https://en.wikipedia.org/wiki/Pythonidae.

  2. 2.

    https://en.wikipedia.org/wiki/Monty_Python.

  3. 3.

    https://en.wikipedia.org/wiki/Python_(programming_language).

  4. 4.

    Terms may have associated weights.

  5. 5.

    https://en.wikipedia.org/wiki/Jaguar.

  6. 6.

    http://www.jaguar.co.uk/.

  7. 7.

    https://en.wikipedia.org/wiki/Jaguar_Racing.

  8. 8.

    http://www.retrogamer.net/profiles/hardware/atari-jaguar-2/.

  9. 9.

    http://www.jaguars.com/.

  10. 10.

    The other option, using sum instead of max, could cause some highly connected nodes in \(N_2\) to have much higher weights than those in \(N_1\).

  11. 11.

    P. Onca is the scientific name of the wild cat called Jaguar.

  12. 12.

    https://sites.google.com/site/slrcompanion2016/.

References

  1. Blei, D.M., Ng, A.Y., Jordan, M.I.: Latent dirichlet allocation. J. Mach. Learn. Res. 3, 993–1022 (2003)

    MATH  Google Scholar 

  2. Bouchoucha, A., He, J., Nie, J.Y.: Diversified query expansion using conceptnet. In: Proceedings of the 22nd ACM International Conference on Conference on Information and Knowledge Management, pp. 1861–1864. ACM (2013)

    Google Scholar 

  3. Bouchoucha, A., Liu, X., Nie, J.-Y.: Integrating multiple resources for diversified query expansion. In: Rijke, M., Kenter, T., Vries, A.P., Zhai, C.X., Jong, F., Radinsky, K., Hofmann, K. (eds.) ECIR 2014. LNCS, vol. 8416, pp. 437–442. Springer, Heidelberg (2014). doi:10.1007/978-3-319-06028-6_38

    Chapter  Google Scholar 

  4. Bouchoucha, A., Liu, X., Nie, J.-Y.: Towards query level resource weighting for diversified query expansion. In: Hanbury, A., Kazai, G., Rauber, A., Fuhr, N. (eds.) ECIR 2015. LNCS, vol. 9022, pp. 1–12. Springer, Heidelberg (2015). doi:10.1007/978-3-319-16354-3_1

    Google Scholar 

  5. Carbonell, J., Goldstein, J.: The use of mmr, diversity-based reranking for reordering documents and producing summaries. In: Proceedings of the 21st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 335–336. ACM (1998)

    Google Scholar 

  6. Ceccarelli, D., Lucchese, C., Orlando, S., Perego, R., Trani, S.: Dexter 2.0 - an open source tool for semantically enriching data. In: Proceedings of the ISWC 2014 Posters and Demonstrations Track a Track within the 13th International Semantic Web Conference, ISWC 2014, Riva del Garda, Italy, October 21, 2014, pp. 417–420 (2014)

    Google Scholar 

  7. Clueweb: (2009). http://lemurproject.org/clueweb09/

  8. Collins-Thompson, K.: Estimating robust query models with convex optimization. In: Advances in Neural Information Processing Systems, pp. 329–336 (2009)

    Google Scholar 

  9. Dalton, J., Dietz, L., Allan, J.: Entity query feature expansion using knowledge base links. In: Proceedings of the 37th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 365–374. ACM (2014)

    Google Scholar 

  10. Deepak, P., Ranu, S., Banerjee, P., Mehta, S.: Entity linking for web search queries. In: Hanbury, A., Kazai, G., Rauber, A., Fuhr, N. (eds.) ECIR 2015. LNCS, vol. 9022, pp. 394–399. Springer, Heidelberg (2015). doi:10.1007/978-3-319-16354-3_43

    Google Scholar 

  11. Dou, Z., Hu, S., Chen, K., Song, R., Wen, J.R.: Multi-dimensional search result diversification. In: Proceedings of the Fourth ACM International Conference on Web Search and Data Mining, pp. 475–484. ACM (2011)

    Google Scholar 

  12. Ferragina, P., Scaiella, U.: Tagme: on-the-fly annotation of short text fragments (by wikipedia entities). In: Proceedings of the 19th ACM International Conference on Information and Knowledge Management, pp. 1625–1628. ACM (2010)

    Google Scholar 

  13. Gabrilovich, E., Markovitch, S.: Computing semantic relatedness using wikipedia-based explicit semantic analysis. IJCAI 7, 1606–1611 (2007)

    Google Scholar 

  14. He, B., Ounis, I.: Combining fields for query expansion and adaptive query expansion. Inf. Process. Manage. 43(5), 1294–1307 (2007)

    Article  Google Scholar 

  15. He, J., Hollink, V., de Vries, A.: Combining implicit and explicit topic representations for result diversification. In: Proceedings of the 35th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 851–860. ACM (2012)

    Google Scholar 

  16. Jakarta, A.: Apache lucene-a high-performance, full-featured text search engine library (2004)

    Google Scholar 

  17. Liu, X., Bouchoucha, A., Sordoni, A., Nie, J.Y.: Compact aspect embedding for diversified query expansions. Proc. AAAI 14, 115–121 (2014)

    Google Scholar 

  18. Page, L., Brin, S., Motwani, R., Winograd, T.: The pagerank citation ranking: Bringing order to the web. In: Proceedings of the 7th International World Wide Web Conference, pp. 161–172 (1998)

    Google Scholar 

  19. Pemantle, R.: Vertex-reinforced random walk. Probab. Theor. Relat. Fields 92(1), 117–136 (1992)

    Article  MathSciNet  MATH  Google Scholar 

  20. Santos, R.L., Macdonald, C., Ounis, I.: Exploiting query reformulations for web search result diversification. In: Proceedings of the 19th International Conference on World Wide Web, pp. 881–890. ACM (2010)

    Google Scholar 

  21. Santos, R.L.T., Peng, J., Macdonald, C., Ounis, I.: Explicit search result diversification through sub-queries. In: Gurrin, C., He, Y., Kazai, G., Kruschwitz, U., Little, S., Roelleke, T., Rüger, S., Rijsbergen, K. (eds.) ECIR 2010. LNCS, vol. 5993, pp. 87–99. Springer, Heidelberg (2010). doi:10.1007/978-3-642-12275-0_11

    Chapter  Google Scholar 

  22. Schuhmacher, M., Ponzetto, S.P.: Knowledge-based graph document modeling. In: Proceedings of the 7th ACM International Conference on Web Search and Data Mining, pp. 543–552. ACM (2014)

    Google Scholar 

  23. Singh, A., Raghu, D., et al.: Retrieving similar discussion forum threads: a structure based approach. In: Proceedings of the 35th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 135–144. ACM (2012)

    Google Scholar 

  24. Song, R., Luo, Z., Wen, J.R., Yu, Y., Hon, H.W.: Identifying ambiguous queries in web search. In: Proceedings of the 16th International Conference on World Wide Web, pp. 1169–1170. ACM (2007)

    Google Scholar 

  25. Strohman, T., Metzler, D., Turtle, H., Croft, W.B.: Indri: A language model-based search engine for complex queries. In: Proceedings of the International Conference on Intelligent Analysis. vol. 2, pp. 2–6. Citeseer (2005)

    Google Scholar 

  26. Vargas, S., Santos, R.L., Macdonald, C., Ounis, I.: Selecting effective expansion terms for diversity. In: Proceedings of the 10th Conference on Open Research Areas in Information Retrieval, pp. 69–76 (2013)

    Google Scholar 

  27. Whissell, J.S., Clarke, C.L.: Improving document clustering using okapi bm25 feature weighting. Inf. Retr. 14(5), 466–487 (2011)

    Article  Google Scholar 

  28. Xu, Y., Jones, G.J., Wang, B.: Query dependent pseudo-relevance feedback based on wikipedia. In: Proceedings of the 32nd International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 59–66. ACM (2009)

    Google Scholar 

  29. Zhu, X., Goldberg, A.B., Van Gael, J., Andrzejewski, D.: Improving diversity in ranking using absorbing random walks. In: HLT-NAACL, pp. 97–104. Citeseer (2007)

    Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Adit Krishnan .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2016 Springer International Publishing AG

About this paper

Cite this paper

Krishnan, A., Padmanabhan, D., Ranu, S., Mehta, S. (2016). Select, Link and Rank: Diversified Query Expansion and Entity Ranking Using Wikipedia. In: Cellary, W., Mokbel, M., Wang, J., Wang, H., Zhou, R., Zhang, Y. (eds) Web Information Systems Engineering – WISE 2016. WISE 2016. Lecture Notes in Computer Science(), vol 10041. Springer, Cham. https://doi.org/10.1007/978-3-319-48740-3_11

Download citation

  • DOI: https://doi.org/10.1007/978-3-319-48740-3_11

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-48739-7

  • Online ISBN: 978-3-319-48740-3

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics