Information Retrieval, Volume 14, Issue 6, pp 572–592

Intent-based diversification of web search results: metrics and algorithms

  • Olivier Chapelle
  • Shihao Ji
  • Ciya Liao
  • Emre Velipasaoglu
  • Larry Lai
  • Su-Lin Wu
Article

Abstract

We study the problem of web search result diversification in the case where intent-based relevance scores are available. A diversified search result will hopefully satisfy the information need of users who may have different intents. In this context, we first analyze the properties of an intent-based metric, ERR-IA, which measures relevance and diversity jointly. We argue that this is a better metric than some previously proposed intent-aware metrics and show that it correlates better with abandonment rate. We then propose an algorithm to rerank web search results by optimizing an objective function corresponding to this metric and evaluate it on shopping-related queries.
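As an illustration of the kind of metric the abstract refers to, the sketch below computes an intent-aware expected reciprocal rank: ERR is evaluated separately for each intent and the results are averaged with the intent probabilities as weights. It is a minimal sketch, not the authors' implementation. The function names, the `sat` mapping of per-intent satisfaction probabilities, and the toy data are assumptions made for this example; in practice the satisfaction probabilities would be derived from intent-based relevance grades.

```python
from typing import Dict, Sequence


def err(stop_probs: Sequence[float]) -> float:
    """Expected reciprocal rank for a single intent.

    stop_probs[i] is the assumed probability (in [0, 1]) that the document
    at rank i + 1 satisfies a user with this intent.
    """
    score = 0.0
    p_reach = 1.0  # probability that the user reaches the current rank
    for rank, p in enumerate(stop_probs, start=1):
        score += p_reach * p / rank
        p_reach *= 1.0 - p  # user continues only if not satisfied here
    return score


def err_ia(
    ranking: Sequence[str],
    intent_probs: Dict[str, float],
    sat: Dict[str, Dict[str, float]],
) -> float:
    """Intent-aware ERR: per-intent ERR weighted by intent probability.

    sat[doc][intent] is a hypothetical satisfaction probability; a missing
    entry is treated as 0 (document irrelevant to that intent).
    """
    return sum(
        p_intent * err([sat[doc].get(intent, 0.0) for doc in ranking])
        for intent, p_intent in intent_probs.items()
    )


# Toy data (hypothetical): two equally likely intents, three documents.
intent_probs = {"buy": 0.5, "review": 0.5}
sat = {
    "a": {"buy": 0.5},
    "b": {"buy": 0.5},
    "c": {"review": 0.5},
}

redundant = err_ia(["a", "b", "c"], intent_probs, sat)
diversified = err_ia(["a", "c", "b"], intent_probs, sat)
```

On this toy data the ranking that places the "review" document second scores higher than the ranking that leads with two "buy" documents, which is the behaviour a metric combining relevance and diversity is expected to show.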

Keywords

Web search, Ranking, Relevance, Diversification

Copyright information

© Springer Science+Business Media, LLC 2011

Authors and Affiliations

  • Olivier Chapelle (1)
  • Shihao Ji (2)
  • Ciya Liao (3)
  • Emre Velipasaoglu (1)
  • Larry Lai (1)
  • Su-Lin Wu (1)

  1. Yahoo! Labs, Sunnyvale, USA
  2. Microsoft Bing, Bellevue, USA
  3. Microsoft Bing, Mountain View, USA
