Probabilistic Learning by Uncertainty Sampling with Non-Binary Relevance

  • Gianni Amati
  • Fabio Crestani
Part of the Studies in Fuzziness and Soft Computing book series (STUDFUZZ, volume 50)


We present a model for probabilistic learning in information retrieval and information filtering that is based on the concept of “uncertainty sampling”. Uncertainty sampling is a technique that exploits user relevance feedback on both relevant and non-relevant documents. In particular, uncertainty sampling uses those documents whose relevance is most uncertain to speed up the learning of the user's relevance criteria. We extend uncertainty sampling to multiple levels of relevance, and we show how this new learning model for information retrieval and filtering could be evaluated using collections with non-binary relevance assessments.
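
As a rough illustration of the selection step described in the abstract, the following Python sketch picks the documents whose estimated relevance is most uncertain. It is not the chapter's model: the function names, the scores, and the choice of uncertainty measures (distance from 0.5 for binary relevance, Shannon entropy over relevance levels for the non-binary case) are assumptions made here for illustration only.

```python
import math


def binary_uncertainty(p):
    """Uncertainty of a binary relevance estimate (assumed measure):
    1.0 when p = 0.5, falling to 0.0 when p = 0 or p = 1."""
    return 1.0 - 2.0 * abs(p - 0.5)


def entropy(dist):
    """Shannon entropy (in bits) of a distribution over relevance levels.
    Used here as an assumed uncertainty measure for non-binary relevance."""
    return -sum(q * math.log2(q) for q in dist if q > 0)


def select_most_uncertain(scores, k, uncertainty):
    """Return the ids of the k documents with the highest uncertainty."""
    ranked = sorted(scores, key=lambda doc_id: uncertainty(scores[doc_id]),
                    reverse=True)
    return ranked[:k]


# Hypothetical estimates: probability of relevance per document.
binary_scores = {"d1": 0.95, "d2": 0.52, "d3": 0.10, "d4": 0.40}

# Hypothetical estimates: distribution over three relevance levels.
graded_scores = {
    "d1": [0.90, 0.05, 0.05],
    "d2": [0.34, 0.33, 0.33],
    "d3": [0.10, 0.80, 0.10],
}

print(select_most_uncertain(binary_scores, 2, binary_uncertainty))
print(select_most_uncertain(graded_scores, 1, entropy))
```

In both cases the selected documents are the ones that would be shown to the user for a relevance judgement, on the assumption that their feedback is the most informative for the next update of the estimates.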


Information Retrieval, Relevant Document, Relevance Feedback, Information Retrieval System, Relevance Judgement
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2000

Authors and Affiliations

  • Gianni Amati (1)
  • Fabio Crestani (2)
  1. Fondazione Ugo Bordoni, Roma, Italy
  2. Computing Science Department, University of Glasgow, Glasgow, Scotland
