Placing Query Term Proximity in Search Context

  • Conference paper

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1240)

Abstract

In information retrieval (IR) systems, relevance manifestation is pivotal and is regularly based on document-term statistics, e.g. term frequency (tf) and inverse document frequency (idf). Query Term Proximity (QTP) within matched documents remains largely under-explored for relevance estimation in IR. In this paper, we systematically review the lineage of the notion of QTP in IR and propose a novel framework for relevance estimation. The proposed framework, referred to as Adaptive QTP based User Information Retrieval (AQtpUIR), is intended to promote a document's relevance among all relevant retrieved documents. Here, the relevance estimation is a weighted combination of document-term (DT) statistics and query-term (QT) statistics. The notion of 'term-term query proximity' is a simple aggregation of contextual aspects of user search in relevance estimates and query formation. Intuitively, QTP is exploited to promote documents for balanced exploitation-exploration and eventually to navigate a search towards its goals. The design analysis asserts the usability of QTP measures for balancing several seeking tradeoffs, e.g. relevance, novelty, and result diversity (coverage and topicality), and highlights various inherent challenges and issues of the proposed work.
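The weighted combination of DT and QT statistics described in the abstract can be illustrated with a minimal sketch. The abstract does not give the AQtpUIR scoring formula, so everything below is an assumption made for illustration: the function names, the textbook tf-idf form used as the DT statistic, the inverse minimum pairwise distance used as the QT statistic, and the mixing weight `alpha` are all hypothetical.

```python
import math
from itertools import combinations


def tf_idf_score(query_terms, doc_tokens, corpus):
    """DT statistic (assumed textbook form): sum of tf * idf over query terms."""
    n_docs = len(corpus)
    score = 0.0
    for term in set(query_terms):
        tf = doc_tokens.count(term)
        df = sum(1 for doc in corpus if term in doc)
        if tf == 0 or df == 0:
            continue
        score += tf * math.log(1 + n_docs / df)
    return score


def proximity_score(query_terms, doc_tokens):
    """QT statistic (assumed form): mean of 1 / minimum token distance
    over all pairs of distinct query terms; absent pairs contribute 0."""
    terms = sorted(set(query_terms))
    positions = {
        t: [i for i, tok in enumerate(doc_tokens) if tok == t] for t in terms
    }
    pairs = list(combinations(terms, 2))
    if not pairs:
        return 0.0
    total = 0.0
    for a, b in pairs:
        if positions[a] and positions[b]:
            # Distinct terms never share a position, so min_dist >= 1.
            min_dist = min(abs(i - j) for i in positions[a] for j in positions[b])
            total += 1.0 / min_dist
    return total / len(pairs)


def combined_relevance(query_terms, doc_tokens, corpus, alpha=0.7):
    """Weighted combination of DT and QT statistics; alpha is a hypothetical weight."""
    dt = tf_idf_score(query_terms, doc_tokens, corpus)
    qt = proximity_score(query_terms, doc_tokens)
    return alpha * dt + (1 - alpha) * qt


# Toy example: both documents contain each query term once,
# but only the first has the terms next to each other.
query = ["term", "proximity"]
doc_near = "query term proximity helps ranking".split()
doc_far = "term statistics alone may ignore word proximity".split()
corpus = [doc_near, doc_far]

ranked = sorted(
    corpus,
    key=lambda d: combined_relevance(query, d, corpus),
    reverse=True,
)
```

In this toy setup the two documents have identical DT scores, so the ordering in `ranked` is decided by the proximity term alone. In practice the two statistics sit on different scales and would likely need normalisation before mixing, which this sketch omits.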

Author information

Corresponding author

Correspondence to Vikram Singh.

Copyright information

© 2020 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Barik, T., Singh, V. (2020). Placing Query Term Proximity in Search Context. In: Bhattacharjee, A., Borgohain, S., Soni, B., Verma, G., Gao, XZ. (eds) Machine Learning, Image Processing, Network Security and Data Sciences. MIND 2020. Communications in Computer and Information Science, vol 1240. Springer, Singapore. https://doi.org/10.1007/978-981-15-6315-7_1

  • DOI: https://doi.org/10.1007/978-981-15-6315-7_1

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-15-6314-0

  • Online ISBN: 978-981-15-6315-7

  • eBook Packages: Computer Science, Computer Science (R0)