
Local and Global Query Expansion for Hierarchical Complex Topics

  • Jeffrey Dalton
  • Shahrzad Naseri
  • Laura Dietz
  • James Allan
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11437)

Abstract

In this work we study local and global methods for query expansion for multifaceted complex topics. We study word-based and entity-based expansion methods and extend these approaches to complex topics using fine-grained expansion on different elements of the hierarchical query structure. For a source of hierarchical complex topics we use the TREC Complex Answer Retrieval (CAR) benchmark data collection. We find that leveraging the hierarchical topic structure is needed for both local and global expansion methods to be effective. Further, the results demonstrate that entity-based expansion methods show significant gains over word-based models alone, with local feedback providing the largest improvement. The results on the CAR paragraph retrieval task demonstrate that expansion models that incorporate both the hierarchical query structure and entity-based expansion result in a greater than 20% improvement over word-based expansion approaches.
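As an illustration only, local (feedback-based) expansion applied separately to each level of a hierarchical topic might be sketched as follows. This is a simplified, frequency-based pseudo-relevance feedback over a toy corpus, not the expansion models evaluated in the paper; the corpus, the `toy_retrieve` ranker, the topic dictionary, and all function names are hypothetical.

```python
from collections import Counter

# Hypothetical toy corpus standing in for a paragraph collection.
CORPUS = [
    "green sea turtle nests on tropical beaches",
    "sea turtle habitat includes seagrass beds and coral reefs",
    "coral reefs are a habitat for many fish",
    "stock markets fell sharply today",
]


def toy_retrieve(query, n=2):
    """Rank paragraphs by the number of distinct query terms they contain."""
    q_terms = set(query.lower().split())
    ranked = sorted(CORPUS, key=lambda d: -len(q_terms & set(d.split())))
    return ranked[:n]


def expansion_terms(feedback_docs, query_terms, k=2):
    """Return the k most frequent non-query terms in the feedback documents,
    each with its relative frequency as a weight."""
    counts = Counter()
    for doc in feedback_docs:
        counts.update(t for t in doc.lower().split() if t not in query_terms)
    total = sum(counts.values())
    if total == 0:
        return []
    return [(term, count / total) for term, count in counts.most_common(k)]


def expand_hierarchical(topic, retrieve, k=2):
    """Run feedback expansion independently for each element of the topic
    (for example the page title and a section heading)."""
    expanded = {}
    for level, text in topic.items():
        query_terms = set(text.lower().split())
        expanded[level] = expansion_terms(retrieve(text), query_terms, k)
    return expanded


# Assumed two-level topic: a page title and one section heading.
topic = {"title": "green sea turtle", "heading": "habitat"}
expanded = expand_hierarchical(topic, toy_retrieve)
for level, terms in expanded.items():
    print(level, terms)
```

Expanding each level on its own keeps the feedback for a short heading such as "habitat" from being swamped by terms drawn only from the title; how the per-level expansions are weighted and combined is left open in this sketch.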

Notes

Acknowledgments

This work was supported in part by the Center for Intelligent Information Retrieval and in part by NSF grant #IIS-1617408. Any opinions, findings and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect those of the sponsors.

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Jeffrey Dalton (1, corresponding author)
  • Shahrzad Naseri (2)
  • Laura Dietz (3)
  • James Allan (2)

  1. School of Computing Science, University of Glasgow, Glasgow, UK
  2. Center for Intelligent Information Retrieval, College of Information and Computer Sciences, University of Massachusetts Amherst, Amherst, USA
  3. Department of Computer Science, University of New Hampshire, Durham, USA
