Pseudo Topic Analysis for Boosting Pseudo Relevance Feedback

Yan, Rong; Gao, Guanglai

doi:10.1007/978-3-030-26072-9_26

Rong Yan & Guanglai Gao

Conference paper
First Online: 18 July 2019

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11641)

Abstract

Traditional Pseudo Relevance Feedback (PRF) approaches fail to model intricate real-world user activities. They naively assume that the first-pass top-ranked search results, i.e. the pseudo relevant set, contain potentially relevant aspects of the user query. The major challenge in PRF therefore lies in obtaining reliable relevant feedback content for the user's real information need. Two problems should not be ignored: (1) the assumed relevant documents intertwine relevant and non-relevant content, which reduces the reliability of the expansion resource and prevents it from concentrating on the truly relevant portion; (2) even if the assumed relevant documents are truly relevant to the user query, they are often semantically redundant in various forms because of the peculiarities of natural language expression. Furthermore, this aggravates the 'query drift' problem. To alleviate these problems, we propose a novel PRF approach that diversifies the feedback source; its main aim is to gather relevant information from the pseudo relevant set that is both semantically focused and diverse. The key idea behind our PRF approach is to construct an abstract pseudo content, obtained from topical network modeling over the set of top-ranked documents, to represent the feedback documents, so as to cover as many diverse aspects of the feedback set as possible at a small semantic granularity. Experimental results on real datasets indicate that the proposed strategies show great promise for finding more reliable feedback sources, helping to achieve query and search result diversity without giving up precision.
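
To make the baseline assumption concrete, the following is a minimal sketch of a classical frequency-based PRF loop, not the approach proposed in the paper. The toy corpus `DOCS`, the term-overlap ranking in `rank`, and the function `expand_query` are all hypothetical names chosen for illustration; a real system would use a retrieval model such as BM25 and would filter stop words.

```python
from collections import Counter

# Hypothetical toy corpus: the query "jaguar" matches two unrelated senses.
DOCS = [
    "jaguar car engine car speed",
    "jaguar cat habitat jungle cat",
    "python snake habitat",
]


def rank(query_terms, docs):
    """Toy first-pass retrieval: rank document indices by query-term counts."""
    scored = []
    for i, doc in enumerate(docs):
        tokens = doc.lower().split()
        score = sum(tokens.count(t) for t in query_terms)
        scored.append((score, i))
    # Highest score first; ties keep the original document order.
    return [i for _, i in sorted(scored, key=lambda p: (-p[0], p[1]))]


def expand_query(query_terms, docs, k=2, n_terms=2):
    """Baseline PRF: treat the top-k documents as relevant and append
    the n_terms most frequent terms that are not already in the query."""
    counts = Counter()
    for i in rank(query_terms, docs)[:k]:
        for tok in docs[i].lower().split():
            if tok not in query_terms:
                counts[tok] += 1
    # Most frequent first; ties broken alphabetically for determinism.
    ranked = sorted(counts.items(), key=lambda p: (-p[1], p[0]))
    return list(query_terms) + [t for t, _ in ranked[:n_terms]]


if __name__ == "__main__":
    print(expand_query(["jaguar"], DOCS))
```

With this toy corpus the expanded query mixes terms from both senses of the query word. If the user meant the animal, the term drawn from the car document illustrates problem (1) above and the resulting query drift.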

Acknowledgements

This research is jointly supported by the National Natural Science Foundation of China (Grant Nos. 61866029, 61763034), the Natural Science Foundation of Inner Mongolia Autonomous Region (Grant No. 2018MS06025) and the Program of Higher-Level Talents of Inner Mongolia University (Grant No. 21500-5175128).

Author information

Authors and Affiliations

College of Computer Science, Inner Mongolia University, Hohhot, People’s Republic of China
Rong Yan & Guanglai Gao
Inner Mongolia Key Laboratory of Mongolian Information Processing Technology, Hohhot, People’s Republic of China
Rong Yan & Guanglai Gao

Corresponding author

Correspondence to Rong Yan.

Editor information

Editors and Affiliations

University of Electronic Science and Technology of China, Chengdu, China
Jie Shao
Hong Kong Polytechnic University, Hong Kong, China
Man Lung Yiu
The University of Tokyo, Tokyo, Japan
Masashi Toyoda
Zhejiang University, Hangzhou, China
Dongxiang Zhang
National University of Singapore, Singapore, Singapore
Wei Wang
Peking University, Beijing, China
Bin Cui

About this paper

Cite this paper

Yan, R., Gao, G. (2019). Pseudo Topic Analysis for Boosting Pseudo Relevance Feedback. In: Shao, J., Yiu, M., Toyoda, M., Zhang, D., Wang, W., Cui, B. (eds) Web and Big Data. APWeb-WAIM 2019. Lecture Notes in Computer Science, vol 11641. Springer, Cham. https://doi.org/10.1007/978-3-030-26072-9_26

DOI: https://doi.org/10.1007/978-3-030-26072-9_26
Published: 18 July 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-26071-2
Online ISBN: 978-3-030-26072-9
eBook Packages: Computer Science, Computer Science (R0)
