Abstract
Traditional Pseudo Relevance Feedback (PRF) approaches fail to model intricate real-world user activities. They naively assume that the first-pass top-ranked search results, i.e. the pseudo relevant set, contain aspects that are potentially relevant to the user query. The major challenge in PRF therefore lies in obtaining reliable feedback content that reflects the user's real information need. Two problems should not be ignored: (1) the assumed relevant documents interleave relevant and non-relevant content, which reduces the reliability of the expansion resource and makes it hard to concentrate on the truly relevant portion; (2) even if the assumed relevant documents are truly relevant to the user query, they are often semantically redundant, expressing the same content in various forms because of the peculiarities of natural language expression. Furthermore, this redundancy aggravates the ‘query drift’ problem. To alleviate these problems, we propose a novel PRF approach that diversifies the feedback source; its main aim is to gather from the pseudo relevant set relevant information that is semantically focused in each unit yet diverse overall. The key idea behind our PRF approach is to construct an abstract pseudo content, obtained from topical network modeling over the set of top-ranked documents, to represent the feedback documents, so as to cover as many diverse aspects of the feedback set as possible at a small semantic granularity. Experimental results on real datasets indicate that the proposed strategies show great promise for finding a more reliable feedback source, helping to achieve query and search result diversity without giving up precision.
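For readers unfamiliar with the baseline being criticised, the following is a minimal sketch of classical PRF query expansion, not the method proposed in this paper. It assumes whitespace tokenisation and raw term frequency as the term-selection score; the function name `expand_query`, its parameters and the toy documents are hypothetical and chosen only for illustration.

```python
from collections import Counter


def expand_query(query_terms, ranked_docs, k=2, n_terms=1):
    """Classical PRF sketch: assume the top-k first-pass results are
    relevant and append their most frequent non-query terms."""
    pseudo_relevant = ranked_docs[:k]  # the assumed-relevant (pseudo relevant) set
    counts = Counter()
    for doc in pseudo_relevant:
        # Illustrative assumption: lowercase + whitespace tokenisation.
        counts.update(t for t in doc.lower().split() if t not in query_terms)
    # Sort by descending frequency, then alphabetically so ties are deterministic.
    ranked_terms = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    expansion = [term for term, _ in ranked_terms[:n_terms]]
    return list(query_terms) + expansion


# Toy first-pass ranking (hypothetical data): the two top documents
# repeat the same aspect of an ambiguous query.
docs = [
    "jaguar car engine speed",
    "jaguar car dealer price",
    "jaguar cat habitat rainforest",
]
expanded = expand_query(["jaguar"], docs, k=2, n_terms=1)
print(expanded)
```

In this toy ranking the two top documents cover the same aspect, so the expansion term comes entirely from that aspect and the third document's aspect is never represented. This is the kind of redundancy in the feedback set that the abstract argues should be countered by diversifying the feedback source.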
Acknowledgements
This research is jointly supported by the National Natural Science Foundation of China (Grant No. 61866029, 61763034), Natural Science Foundation of Inner Mongolia Autonomous Region (Grant No. 2018MS06025) and Program of Higher-Level Talents of Inner Mongolia University (Grant No. 21500-5175128).
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Yan, R., Gao, G. (2019). Pseudo Topic Analysis for Boosting Pseudo Relevance Feedback. In: Shao, J., Yiu, M., Toyoda, M., Zhang, D., Wang, W., Cui, B. (eds) Web and Big Data. APWeb-WAIM 2019. Lecture Notes in Computer Science, vol 11641. Springer, Cham. https://doi.org/10.1007/978-3-030-26072-9_26
DOI: https://doi.org/10.1007/978-3-030-26072-9_26
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-26071-2
Online ISBN: 978-3-030-26072-9
eBook Packages: Computer Science, Computer Science (R0)