Boosting Web Retrieval Through Query Operations

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 3408)

Abstract

We explore the use of phrase and proximity terms in the context of web retrieval, which is different from traditional ad-hoc retrieval both in document structure and in query characteristics. We show that for this type of task, the usage of both phrase and proximity terms is highly beneficial for early precision as well as for overall retrieval effectiveness. We also analyze why phrase and proximity terms are far more effective for web retrieval than for ad-hoc retrieval.

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Mishne, G., de Rijke, M. (2005). Boosting Web Retrieval Through Query Operations. In: Losada, D.E., Fernández-Luna, J.M. (eds) Advances in Information Retrieval. ECIR 2005. Lecture Notes in Computer Science, vol 3408. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-31865-1_36

  • DOI: https://doi.org/10.1007/978-3-540-31865-1_36

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-25295-5

  • Online ISBN: 978-3-540-31865-1

  • eBook Packages: Computer Science, Computer Science (R0)
