An Axiomatic Approach to Diagnosing Neural IR Models

  • Daniël Rennings
  • Felipe Moraes
  • Claudia Hauff (email author)
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11437)

Abstract

Traditional retrieval models such as BM25 or language models were engineered based on search heuristics that were later formalized into axioms. The axiomatic approach to information retrieval (IR) has shown that the effectiveness of a retrieval method is connected to its fulfillment of axioms. This approach has enabled researchers to identify shortcomings in existing approaches and “fix” them. With the new wave of neural-network-based approaches to IR, a theoretical analysis of those retrieval models is no longer feasible, as they potentially contain millions of parameters. In this paper, we propose a pipeline to create diagnostic datasets for IR, each engineered to fulfill one axiom. We execute our pipeline on the recently released large-scale question answering dataset WikiPassageQA (which contains over 4000 topics) and create diagnostic datasets for four axioms. We empirically validate to what extent well-known deep IR models are able to realize the axiomatic pattern underlying the datasets. Our evaluation shows that there is indeed a positive relation between the performance of neural approaches on diagnostic datasets and their retrieval effectiveness. Based on these findings, we argue that diagnostic datasets grounded in axioms are a good approach to diagnosing neural IR models.
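
To make the idea of a diagnostic dataset "engineered to fulfill one axiom" concrete, the following is a minimal sketch rather than the pipeline described in the paper. It assumes a term-frequency constraint of the kind known as TFC1 in the axiomatic IR literature: for two documents of equal length, the one containing the query term more often should score higher. The abstract does not name the four axioms, so this choice is only illustrative. All names here (make_tfc1_pair, saturating_tf_score, axiom_accuracy) are hypothetical, and the toy scorer stands in for a trained neural model.

```python
from typing import Callable, List, Tuple

Doc = List[str]
Scorer = Callable[[List[str], Doc], float]


def make_tfc1_pair(query_term: str, filler: Doc, low_tf: int, high_tf: int) -> Tuple[Doc, Doc]:
    """Build two equal-length documents that differ only in how often
    the query term occurs. Assumes `filler` does not contain the query term."""
    assert 0 <= low_tf < high_tf <= len(filler)
    more = [query_term] * high_tf + filler[high_tf:]
    less = [query_term] * low_tf + filler[low_tf:]
    return more, less


def saturating_tf_score(query: List[str], doc: Doc, k1: float = 1.2) -> float:
    """Toy scorer (an assumption, not a model from the paper):
    a saturating function of term frequency, summed over query terms."""
    score = 0.0
    for term in query:
        tf = doc.count(term)
        score += tf / (tf + k1)
    return score


def axiom_accuracy(scorer: Scorer, query: List[str], pairs: List[Tuple[Doc, Doc]]) -> float:
    """Fraction of pairs in which the scorer strictly prefers the document
    with more query-term occurrences, i.e. follows the axiomatic pattern."""
    satisfied = sum(1 for more, less in pairs if scorer(query, more) > scorer(query, less))
    return satisfied / len(pairs)


# Hypothetical filler vocabulary and query, for illustration only.
filler = [f"w{i}" for i in range(10)]
query = ["neural"]
pairs = [make_tfc1_pair("neural", filler, low, low + 1) for low in range(5)]

tf_aware = axiom_accuracy(saturating_tf_score, query, pairs)
length_only = axiom_accuracy(lambda q, d: float(len(d)), query, pairs)
print(tf_aware, length_only)
```

A scorer that is sensitive to term frequency satisfies every pair, while one that looks only at document length satisfies none, since both documents in a pair have the same length. Comparing such per-axiom accuracies with retrieval effectiveness is the kind of analysis the abstract describes.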

Notes

Acknowledgements

This work was funded by NWO projects LACrOSSE (612.001.605) and SearchX (639.022.722) and Deloitte NL.

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Daniël Rennings (1)
  • Felipe Moraes (1)
  • Claudia Hauff (1, email author)

  1. Delft University of Technology, Delft, The Netherlands
