Probabilistic Relevance Models Based on Document and Query Generation

  • John Lafferty
  • ChengXiang Zhai
Part of The Springer International Series on Information Retrieval book series (INRE, volume 13)


We give a unified account of the probabilistic semantics underlying the language modeling approach and the traditional probabilistic model for information retrieval, showing that the two approaches can be viewed as being equivalent probabilistically, since they are based on different factorizations of the same generative relevance model. We also discuss how the two approaches lead to different retrieval frameworks in practice, since they involve component models that are estimated quite differently.
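The equivalence claim can be checked numerically. The sketch below is illustrative only and is not taken from the chapter: it builds a small, randomly generated joint distribution over a document D, a query Q, and a binary relevance variable R (the toy universe and all function names are assumptions), then computes the odds of relevance three ways: directly, through the document-generation factorization p(D,Q|R) = p(D|Q,R) p(Q|R), and through the query-generation factorization p(D,Q|R) = p(Q|D,R) p(D|R).

```python
import itertools
import math
import random

# Hypothetical toy universe: three documents, two queries, binary relevance.
DOCS = ["d1", "d2", "d3"]
QUERIES = ["q1", "q2"]
REL = [True, False]


def make_joint(seed=0):
    """Return a random, strictly positive joint distribution p(D, Q, R)."""
    rng = random.Random(seed)
    weights = {k: rng.random() + 0.01
               for k in itertools.product(DOCS, QUERIES, REL)}
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}


def marginal(joint, d=None, q=None, r=None):
    """Sum the joint over every variable left as None."""
    return sum(p for (dd, qq, rr), p in joint.items()
               if (d is None or dd == d)
               and (q is None or qq == q)
               and (r is None or rr == r))


def odds_direct(joint, d, q):
    """Odds of relevance p(r | d, q) / p(not r | d, q)."""
    return joint[(d, q, True)] / joint[(d, q, False)]


def odds_document_generation(joint, d, q):
    """Odds via p(D, Q | R) = p(D | Q, R) * p(Q | R)."""
    def p_d_given_qr(r):
        return joint[(d, q, r)] / marginal(joint, q=q, r=r)

    def p_q_given_r(r):
        return marginal(joint, q=q, r=r) / marginal(joint, r=r)

    prior = marginal(joint, r=True) / marginal(joint, r=False)
    return (p_d_given_qr(True) / p_d_given_qr(False)
            * p_q_given_r(True) / p_q_given_r(False)
            * prior)


def odds_query_generation(joint, d, q):
    """Odds via p(D, Q | R) = p(Q | D, R) * p(D | R)."""
    def p_q_given_dr(r):
        return joint[(d, q, r)] / marginal(joint, d=d, r=r)

    def p_d_given_r(r):
        return marginal(joint, d=d, r=r) / marginal(joint, r=r)

    prior = marginal(joint, r=True) / marginal(joint, r=False)
    return (p_q_given_dr(True) / p_q_given_dr(False)
            * p_d_given_r(True) / p_d_given_r(False)
            * prior)


if __name__ == "__main__":
    joint = make_joint()
    for d, q in itertools.product(DOCS, QUERIES):
        print(d, q,
              odds_direct(joint, d, q),
              odds_document_generation(joint, d, q),
              odds_query_generation(joint, d, q))
```

With exact probabilities the three quantities agree for every document-query pair, which is the sense in which the two approaches are equivalent. In practice the component models (for example p(D|Q,R) versus p(Q|D,R)) are estimated from data under different simplifying assumptions, so the resulting rankings can differ.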


Keywords: language models, relevance models, generative models





Copyright information

© Springer Science+Business Media Dordrecht 2003

Authors and Affiliations

  • John Lafferty (1)
  • ChengXiang Zhai (2)
  1. School of Computer Science, Carnegie Mellon University, USA
  2. Department of Computer Science, University of Illinois at Urbana-Champaign, USA
