
Placing Query Term Proximity in Search Context

  • Tirthankar Barik
  • Vikram Singh
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 1240)

Abstract

In information retrieval (IR) systems, relevance manifestation is pivotal and is typically based on document-term statistics such as term frequency (tf) and inverse document frequency (idf). Query Term Proximity (QTP) within matched documents remains largely under-explored for relevance estimation. In this paper, we systematically review the lineage of the notion of QTP in IR and propose a novel framework for relevance estimation. The proposed framework, referred to as Adaptive QTP based User Information Retrieval (AQtpUIR), is intended to promote a document’s relevance among all relevant retrieved documents. Here, the relevance estimate is a weighted combination of document-term (DT) statistics and query-term (QT) statistics. The notion of ‘term-term query proximity’ is a simple aggregation of contextual aspects of the user’s search in relevance estimates and query formation. Intuitively, QTP is exploited to promote documents for balanced exploitation-exploration and, eventually, to navigate a search towards its goals. The design analysis asserts the usability of QTP measures for balancing several seeking tradeoffs, e.g. relevance, novelty, and result diversity (coverage and topicality), and highlights various inherent challenges and issues of the proposed work.
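
The weighted combination of DT and QT statistics described above can be illustrated with a minimal sketch. This is not the AQtpUIR scoring function, which the abstract does not specify. The function names, the smoothed tf-idf variant, the inverse-minimum-distance proximity measure, and the mixing weight `alpha` are all assumptions chosen only for illustration.

```python
import math
from itertools import combinations


def dt_score(query_terms, doc_tokens, doc_freq, num_docs):
    """Document-term statistic: sum of tf * smoothed idf over query terms.

    Assumption: a simple smoothed tf-idf stands in for the DT statistics.
    """
    score = 0.0
    for term in set(query_terms):
        tf = doc_tokens.count(term)
        if tf == 0:
            continue
        idf = math.log((num_docs + 1) / (doc_freq.get(term, 0) + 1)) + 1.0
        score += tf * idf
    return score


def qt_score(query_terms, doc_tokens):
    """Query-term proximity statistic: mean of 1 / (minimum token distance)
    over all pairs of distinct query terms; a missing pair contributes 0.

    Assumption: inverse minimum pair distance is one of many possible
    proximity measures and is used here only as an example.
    """
    terms = sorted(set(query_terms))
    positions = {
        t: [i for i, tok in enumerate(doc_tokens) if tok == t] for t in terms
    }
    pairs = list(combinations(terms, 2))
    if not pairs:
        return 0.0
    total = 0.0
    for a, b in pairs:
        if positions[a] and positions[b]:
            # Distinct terms never share a position, so the distance is >= 1.
            dist = min(abs(i - j) for i in positions[a] for j in positions[b])
            total += 1.0 / dist
    return total / len(pairs)


def combined_relevance(query_terms, doc_tokens, doc_freq, num_docs, alpha=0.7):
    """Weighted combination: alpha * DT + (1 - alpha) * QT (alpha is hypothetical)."""
    dt = dt_score(query_terms, doc_tokens, doc_freq, num_docs)
    qt = qt_score(query_terms, doc_tokens)
    return alpha * dt + (1.0 - alpha) * qt


if __name__ == "__main__":
    query = ["query", "term"]
    near = "query term proximity matters".split()
    far = "query about some other unrelated term".split()
    df = {"query": 2, "term": 2}  # toy document frequencies for a 2-document corpus
    for name, doc in (("near", near), ("far", far)):
        print(name, combined_relevance(query, doc, df, num_docs=2))
```

In this toy setting both documents contain each query term once, so their DT scores are identical and the ranking is decided by the proximity component alone. In practice the two components would need to be normalised to comparable ranges before being mixed.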

Keywords

Big data analytics, Exploratory search, Relevance manifestation, Information retrieval

Copyright information

© Springer Nature Singapore Pte Ltd. 2020

Authors and Affiliations

  1. National Institute of Technology, Kurukshetra, Kurukshetra, India
