An Effective Framework for Identifying Good XML Feedback Fragments
In pseudo relevance feedback, it is often crucial to identify the good feedback documents from which useful expansion terms can be added to the query. For Extensible Markup Language (XML) data, this paper proposes a complete two-phase framework for identifying good feedback fragments. (1) The first phase clusters the XML search results: a k-medoid clustering algorithm is applied to XML fragments represented in an extended latent semantic indexing model. (2) The second phase is a two-stage ranking. In the first stage, cluster ranking selects relevant clusters on the basis of cluster labelling, where the labels are determined by fragment keywords extracted using a combination of weight and context. In the second stage, fragment ranking uses multiple features to identify high quality fragments within the relevant clusters obtained in the first stage. Experimental results on standard INEX test data show that the proposed approach achieves statistically significant improvements over a strong baseline that uses the original query results, ensuring high quality fragments for feedback.
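The two phases described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each fragment has already been projected into a latent semantic space as a plain list of floats, it initialises the medoids naively with the first k fragments, and it substitutes cosine similarity to the query for both the cluster-labelling step and the multi-feature fragment ranking. All function and variable names (`cosine`, `k_medoids`, `two_stage_rank`, `fragments`, `query`) are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def assign(vectors, medoids):
    """Assign every vector index to the cluster of its most similar medoid."""
    clusters = [[] for _ in medoids]
    for i, v in enumerate(vectors):
        nearest = max(range(len(medoids)),
                      key=lambda c: cosine(v, vectors[medoids[c]]))
        clusters[nearest].append(i)
    return clusters


def k_medoids(vectors, k, max_iter=20):
    """Phase 1 (sketch): alternate assignment and medoid update until stable."""
    medoids = list(range(k))  # assumption: naive initialisation with the first k fragments
    for _ in range(max_iter):
        clusters = assign(vectors, medoids)
        new_medoids = []
        for c, members in enumerate(clusters):
            if not members:  # keep the old medoid if a cluster ends up empty
                new_medoids.append(medoids[c])
                continue
            # The medoid is the member with the smallest total distance to the others.
            best = min(members, key=lambda m: sum(
                1.0 - cosine(vectors[m], vectors[o]) for o in members))
            new_medoids.append(best)
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids, assign(vectors, medoids)


def two_stage_rank(vectors, query, k, top_clusters=1):
    """Phase 2 (sketch): rank clusters first, then rank fragments inside the kept clusters."""
    medoids, clusters = k_medoids(vectors, k)
    # Stage 1: stand-in for label-based cluster ranking (medoid similarity to the query).
    order = sorted(range(k), key=lambda c: cosine(query, vectors[medoids[c]]),
                   reverse=True)
    selected = [i for c in order[:top_clusters] for i in clusters[c]]
    # Stage 2: stand-in for multi-feature fragment ranking (similarity to the query).
    return sorted(selected, key=lambda i: cosine(query, vectors[i]), reverse=True)


# Toy data: five fragments in a two-dimensional latent space and one query.
fragments = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [0.1, 0.9], [0.8, 0.3]]
query = [1.0, 0.1]
ranking = two_stage_rank(fragments, query, k=2)
print(ranking)
```

Filtering by cluster before ranking individual fragments means that a fragment from an off-topic cluster is never considered as a feedback candidate, which is the intuition behind performing the cluster stage first.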
Keywords: XML fragment; Clustering search results; Two-stage ranking model; Pseudo relevance feedback
This material is based upon work supported by the Postdoctoral Fund of China (2017M612602) and South Central University, the Humanities and Social Science Research Project of Jiangxi Province (TQ1504), and the National Science Foundation of China under Grant Numbers 71762017, 61662027, 61363010 and 71361012.