Abstract
Recently, several studies have applied the Markov decision process to diversifying search results in information retrieval (MDP-DIV). Although MDP-DIV delivers promising performance, it suffers from very slow convergence, which hinders its use in real applications. In this paper, we aim to improve MDP-DIV by speeding up its convergence without sacrificing much accuracy. The slow convergence has two main causes: the large action space and data scarcity. On the one hand, the sequential decision at each ranking position must evaluate the query-document relevance for every document in the candidate set, which yields a huge search space for the MDP; on the other hand, because of data scarcity, the agent must carry out more "trial and error" interactions with the environment. To tackle these problems, we propose the MDP-DIV-kNN and MDP-DIV-NTN methods. MDP-DIV-kNN adopts a k-nearest-neighbor strategy, i.e., discarding the k nearest neighbors of the recently selected action (document), to shrink the diversification search space. MDP-DIV-NTN employs a pre-trained diversification neural tensor network (NTN-DIV) as the evaluation model and combines its results with the MDP to produce the final ranking. Experimental results demonstrate that both proposed methods indeed accelerate the convergence of MDP-DIV, achieving roughly a 3x speedup, while the resulting accuracy barely degrades and sometimes even improves.
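The kNN pruning idea described above can be sketched as follows. This is a minimal illustration only, not the authors' implementation: the embedding vectors, the Euclidean distance metric, and the function names are assumptions made for the example.

```python
import math

def euclidean(u, v):
    # Euclidean distance between two embedding vectors.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def prune_candidates(selected_emb, candidates, k):
    """After a document is selected, discard the k candidates whose
    embeddings are nearest to it, shrinking the MDP action space
    before the next ranking position is decided."""
    ranked = sorted(candidates.items(),
                    key=lambda item: euclidean(item[1], selected_emb))
    nearest = {doc_id for doc_id, _ in ranked[:k]}
    return {doc_id: emb for doc_id, emb in candidates.items()
            if doc_id not in nearest}

# Toy usage: d2 is near the just-selected document, d3 is not.
remaining = prune_candidates(
    [0.9, 0.1],
    {"d2": [0.88, 0.12], "d3": [0.1, 0.9]},
    k=1,
)
```

The intuition is that documents very similar to the one just selected contribute little to diversity, so removing them at each step cuts the number of actions the agent must evaluate.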
This work was done while Feng Liu was an intern at Noah's Ark Lab, Huawei.
Notes
- 1. For ease of explanation, we assume each query is associated with the same number of documents.
- 2. To enable end-to-end learning, we use embedding features instead of handcrafted relevance features.
- 3. All queries and documents are embedded with the doc2vec [8] model.
- 4.
- 5. The datasets and source code are available at https://github.com/sweetalyssum/RL4SRD.
- 6.
- 7.
- 8. The authors do not provide the split results; therefore, we re-split the queries.
References
Abadi, M., Agarwal, A., Barham, P., Brevdo, E., Chen, Z., Citro, C., Corrado, G.S., Davis, A., Dean, J., Devin, M., et al.: Tensorflow: large-scale machine learning on heterogeneous distributed systems. arXiv preprint arXiv:1603.04467 (2016)
Carbonell, J., Goldstein, J.: The use of MMR, diversity-based reranking for reordering documents and producing summaries. In: SIGIR 1998, pp. 335–336. ACM (1998)
Chapelle, O., Ji, S., Liao, C., Velipasaoglu, E., Lai, L., Wu, S.L.: Intent-based diversification of web search results: metrics and algorithms. Inf. Retr. 14(6), 572–592 (2011)
Clarke, C.L., Kolla, M., Cormack, G.V., Vechtomova, O., Ashkan, A., Büttcher, S., MacKinnon, I.: Novelty and diversity in information retrieval evaluation. In: SIGIR 2008, pp. 659–666. ACM (2008)
Dang, V., Croft, W.B.: Diversity by proportionality: an election-based approach to search result diversification. In: SIGIR 2012, pp. 65–74. ACM (2012)
Guo, S., Sanner, S.: Probabilistic latent maximal marginal relevance. In: SIGIR 2010, pp. 833–834. ACM (2010)
He, J., Hollink, V., de Vries, A.: Combining implicit and explicit topic representations for result diversification. In: SIGIR 2012, pp. 851–860. ACM (2012)
Le, Q., Mikolov, T.: Distributed representations of sentences and documents. In: ICML 2014, vol. 32, pp. 1188–1196. PMLR, Beijing, 22–24 June 2014
Lu, Z., Yang, Q.: Partially observable Markov decision process for recommender systems. arXiv preprint arXiv:1608.07793 (2016)
Luo, J., Zhang, S., Yang, H.: Win-win search: dual-agent stochastic game in session search. In: SIGIR 2014, pp. 587–596. ACM (2014)
Puterman, M.L.: Markov Decision Processes: Discrete Stochastic Dynamic Programming. Wiley, Hoboken (2014)
Radlinski, F., Kleinberg, R., Joachims, T.: Learning diverse rankings with multi-armed bandits. In: ICML 2008, pp. 784–791. ACM (2008)
Rafiei, D., Bharat, K., Shukla, A.: Diversifying web search results. In: WWW 2010, pp. 781–790. ACM (2010)
Raman, K., Shivaswamy, P., Joachims, T.: Online learning to diversify from implicit feedback. In: SIGKDD 2012, pp. 705–713. ACM (2012)
Santos, R.L., Macdonald, C., Ounis, I.: Exploiting query reformulations for web search result diversification. In: WWW 2010, pp. 881–890. ACM (2010)
Santos, R.L.T., Peng, J., Macdonald, C., Ounis, I.: Explicit search result diversification through sub-queries. In: Gurrin, C., He, Y., Kazai, G., Kruschwitz, U., Little, S., Roelleke, T., Rüger, S., van Rijsbergen, K. (eds.) ECIR 2010. LNCS, vol. 5993, pp. 87–99. Springer, Heidelberg (2010). https://doi.org/10.1007/978-3-642-12275-0_11
Shani, G., Heckerman, D., Brafman, R.I.: An MDP-based recommender system. J. Mach. Learn. Res. 6(Sep), 1265–1295 (2005)
Sutton, R.S., Barto, A.G.: Reinforcement Learning: An Introduction, vol. 1. MIT Press, Cambridge (1998)
Wang, J., Yu, L., Zhang, W., Gong, Y., Xu, Y., Wang, B., Zhang, P., Zhang, D.: IRGAN: a minimax game for unifying generative and discriminative information retrieval models. In: SIGIR 2017, pp. 515–524. ACM, New York (2017)
Wei, Z., Xu, J., Lan, Y., Guo, J., Cheng, X.: Reinforcement learning to rank with Markov decision process. In: SIGIR 2017, pp. 945–948. ACM, New York (2017)
Xia, L., Xu, J., Lan, Y., Guo, J., Cheng, X.: Learning maximal marginal relevance model via directly optimizing diversity evaluation measures. In: SIGIR 2015, pp. 113–122. ACM (2015)
Xia, L., Xu, J., Lan, Y., Guo, J., Cheng, X.: Modeling document novelty with neural tensor network for search result diversification. In: SIGIR 2016, pp. 395–404. ACM (2016)
Xia, L., Xu, J., Lan, Y., Guo, J., Zeng, W., Cheng, X.: Adapting Markov decision process for search result diversification. In: SIGIR 2017, pp. 535–544. ACM, New York (2017)
Xu, J., Xia, L., Lan, Y., Guo, J., Cheng, X.: Directly optimize diversity evaluation measures: a new approach to search result diversification. ACM Trans. Intell. Syst. Technol. (TIST) 8(3), 41 (2017)
Yu, H.T., Jatowt, A., Blanco, R., Joho, H., Jose, J., Chen, L., Yuan, F.: A concise integer linear programming formulation for implicit search result diversification. In: WSDM 2017, pp. 191–200. ACM (2017)
Zhai, C.X., Cohen, W.W., Lafferty, J.: Beyond independent relevance: methods and evaluation metrics for subtopic retrieval. In: SIGIR 2003, pp. 10–17. ACM (2003)
Zhang, S., Luo, J., Yang, H.: A POMDP model for content-free document re-ranking. In: SIGIR 2014, pp. 1139–1142. ACM (2014)
Zhu, Y., Lan, Y., Guo, J., Cheng, X., Niu, S.: Learning for search result diversification. In: SIGIR 2014, pp. 293–302. ACM (2014)
Acknowledgement
This research was supported in part by Shenzhen Science and Technology Program under Grant No. JCYJ20160330163900579, and NSFC under Grant Nos. 61572158 and 61602132.
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
Cite this paper
Liu, F., Tang, R., Li, X., Ye, Y., Guo, H., He, X. (2018). Novel Approaches to Accelerating the Convergence Rate of Markov Decision Process for Search Result Diversification. In: Pei, J., Manolopoulos, Y., Sadiq, S., Li, J. (eds) Database Systems for Advanced Applications. DASFAA 2018. Lecture Notes in Computer Science(), vol 10828. Springer, Cham. https://doi.org/10.1007/978-3-319-91458-9_11
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-91457-2
Online ISBN: 978-3-319-91458-9
eBook Packages: Computer Science (R0)