Encyclopedia of Database Systems

2018 Edition
Editors: Ling Liu, M. Tamer Özsu

Web Information Retrieval Models

  • Craig MacDonald
  • Iadh Ounis
Reference work entry
DOI: https://doi.org/10.1007/978-1-4614-8265-9_928


Web search engines


The Web can be considered a large-scale document collection to which classical text retrieval techniques can be applied. However, its unique features and structure offer new sources of evidence that can be used to enhance the effectiveness of Information Retrieval (IR) systems. Generally, Web IR examines the combination of evidence from both the textual content of documents and the structure of the Web, as well as the search behavior of users and issues related to the evaluation of retrieval effectiveness in the Web setting.

Web Information Retrieval models integrate many sources of evidence about documents, such as their links, structure, actual content, and quality, so that an effective Web search engine can be built. In contrast with the traditional library-type settings of IR systems, the Web is a hostile environment, where Web search engines have to deal with...
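As a rough illustration of what integrating several sources of evidence can look like, the following minimal Python sketch combines per-field term matches (body, title, anchor text) with a query-independent prior derived from the number of incoming links. It is not a model taken from this entry: the field names, the weights, the log-based saturation, and the use of an in-link count as a quality signal are all assumptions chosen for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Doc:
    doc_id: str
    body: str
    title: str
    anchor_text: str  # text of links pointing at this document
    inlinks: int      # hypothetical query-independent quality signal


# Hypothetical weights, chosen only for illustration.
FIELD_WEIGHTS = {"body": 1.0, "title": 2.0, "anchor_text": 1.5}
PRIOR_WEIGHT = 0.5


def term_frequency(query_terms, text):
    """Count occurrences of the query terms in whitespace-tokenised text."""
    tokens = text.lower().split()
    return sum(tokens.count(term) for term in query_terms)


def score(query, doc):
    """Weighted sum of saturated per-field matches plus a link-based prior."""
    terms = query.lower().split()
    content = sum(
        weight * math.log1p(term_frequency(terms, getattr(doc, field)))
        for field, weight in FIELD_WEIGHTS.items()
    )
    if content == 0.0:
        # No query term matched in any field: the prior alone does not
        # make a document retrievable.
        return 0.0
    return content + PRIOR_WEIGHT * math.log1p(doc.inlinks)


def rank(query, docs):
    """Return ids of matching documents, best score first (ties by id)."""
    scored = [(score(query, d), d.doc_id) for d in docs]
    ordered = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for s, doc_id in ordered if s > 0]
```

Calling `rank("glasgow university", docs)` on a list of `Doc` objects returns the matching document ids in decreasing score order. The logarithm dampens repeated terms, so a page that merely repeats the query words in its body gains less than a page that also matches in its title and anchor text and has many in-links.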



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. University of Glasgow, Glasgow, UK

Section editors and affiliations

  • Giambattista Amati (1)
  1. Fondazione Ugo Bordoni, Rome, Italy