An implicit aspect modelling framework for diversity focused query expansion

  • Rahul E. Dev
  • Vidhya Balasubramanian

Abstract

Diversified Query Expansion aims to present the user with a diverse list of query expansions so that they can better communicate their intent to the retrieval system. Current diversified expansion techniques either use external knowledge sources to explicitly model the aspects underlying the user query and the relationships between them, or model query aspects implicitly. However, these techniques assume that query aspects are independent of each other. We propose a unified framework that produces diversified query expansions in a completely implicit manner while also considering the relationships between query aspects. In particular, the framework identifies query aspects and their relationships from the semantic properties of context phrases that occur within the top-ranked documents retrieved for the user query, and maps them onto a Mutating Markov Chain model to generate a diverse ordering of query aspects. We test our framework on a set of ambiguous and faceted queries used in the NTCIR-12 IMine-2 Task. Through an extensive empirical analysis, we show that it consistently outperforms existing implicit diversified query expansion algorithms. The utility of our algorithm is most evident in a second set of experiments, in which we generate diversified query expansions for a retrieval engine indexing documents from specific scientific domains. Even in such a niche scenario, our algorithm provides robust results and performs better than other implicit approaches.
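
The abstract does not define the Mutating Markov Chain model, so the following is only a rough illustration of the general idea of using a Markov chain over related aspects to obtain a diverse ordering. It is not the authors' method. The sketch assumes a hypothetical aspect similarity matrix with values in [0, 1], and the names `diverse_order`, `damping`, and `penalty` are invented for this example. At each step it ranks aspects by the stationary distribution of a random walk, picks the top one, and then down-weights transitions into that aspect and into aspects similar to it.

```python
def normalize_rows(weights):
    """Turn a non-negative weight matrix into a row-stochastic matrix."""
    n = len(weights)
    rows = []
    for row in weights:
        total = sum(row)
        rows.append([w / total for w in row] if total > 0 else [1.0 / n] * n)
    return rows


def stationary(P, iters=200):
    """Approximate the stationary distribution of P by power iteration."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iters):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi


def diverse_order(similarity, damping=0.1, penalty=0.1):
    """Greedy ordering: pick the most visited aspect, then modify the chain.

    similarity: square matrix with entries in [0, 1] (assumed input).
    damping:    probability of a uniform random jump (keeps the chain ergodic).
    penalty:    factor applied to transitions into an already selected aspect.
    """
    n = len(similarity)
    weights = [row[:] for row in similarity]
    remaining = set(range(n))
    order = []
    while remaining:
        P = normalize_rows(weights)
        P = [[(1 - damping) * p + damping / n for p in row] for row in P]
        pi = stationary(P)
        best = max(remaining, key=lambda j: pi[j])
        order.append(best)
        remaining.remove(best)
        # Shrink every transition into aspect j in proportion to how similar
        # j is to the aspect just selected (the selected one shrinks the most).
        for i in range(n):
            for j in range(n):
                weights[i][j] *= 1.0 - (1.0 - penalty) * similarity[best][j]
    return order


# Toy input (made up): aspects 0 and 1 are near-duplicates, as are 2 and 3.
S = [
    [1.0, 0.9, 0.2, 0.1],
    [0.9, 1.0, 0.1, 0.1],
    [0.2, 0.1, 1.0, 0.8],
    [0.1, 0.1, 0.8, 1.0],
]
order = diverse_order(S)
print(order)
```

On this toy input the first two selected aspects come from different groups, which is the behaviour a diversity-focused ordering is meant to show. How the paper derives aspect relationships from context phrases, and how its chain actually mutates, is not stated in the abstract and is not modelled here.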

Keywords

Query expansion; Diversification; Diversified query expansion; Implicit diversification


Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2019

Authors and Affiliations

  1. Department of Computer Science and Engineering, Amrita School of Engineering, Amrita Vishwa Vidyapeetham, Coimbatore, India
