Probabilistic Learning by Uncertainty Sampling with Non-Binary Relevance

  • Chapter
Soft Computing in Information Retrieval

Part of the book series: Studies in Fuzziness and Soft Computing (STUDFUZZ, volume 50)

Abstract

We present a probabilistic learning model for information retrieval and information filtering that is based on the concept of “uncertainty sampling”. Uncertainty sampling is a technique that exploits user relevance feedback on both relevant and non-relevant documents. In particular, it uses those documents whose relevance is most uncertain to speed up the learning of the user's relevance criteria. We extend uncertainty sampling by considering multiple levels of relevance, and we show how this new learning model for information retrieval and filtering could be evaluated using collections with non-binary relevance assessments.
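The selection step described in the abstract can be illustrated with a minimal sketch. It is not the chapter's model: it assumes that some classifier has already produced, for each unjudged document, a probability distribution over relevance levels, and it uses Shannon entropy as the uncertainty measure, which is an assumption made here for illustration. The names `entropy`, `most_uncertain`, and the document ids and probabilities are all hypothetical.

```python
import math
from typing import Dict, List, Sequence


def entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in bits) of a distribution over relevance levels."""
    # Zero-probability levels contribute nothing and are skipped to avoid log(0).
    return -sum(p * math.log2(p) for p in probs if p > 0.0)


def most_uncertain(predictions: Dict[str, Sequence[float]], k: int = 1) -> List[str]:
    """Return the k document ids whose predicted relevance is most uncertain.

    Assumption: uncertainty is measured by entropy; the chapter may use
    a different criterion.
    """
    ranked = sorted(predictions, key=lambda d: entropy(predictions[d]), reverse=True)
    return ranked[:k]


# Hypothetical estimates over three relevance levels
# (e.g. non-relevant, partially relevant, highly relevant).
predictions = {
    "doc_a": [0.90, 0.05, 0.05],  # confidently non-relevant
    "doc_b": [0.34, 0.33, 0.33],  # almost no information about its level
    "doc_c": [0.10, 0.80, 0.10],  # fairly confident
}

# The document to show the user next for a relevance judgement.
print(most_uncertain(predictions, k=1))  # ['doc_b']
```

With only two relevance levels, the entropy criterion reduces to picking the document whose estimated probability of relevance is closest to 0.5, which is the usual binary form of uncertainty sampling.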

Copyright information

© 2000 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Amati, G., Crestani, F. (2000). Probabilistic Learning by Uncertainty Sampling with Non-Binary Relevance. In: Crestani, F., Pasi, G. (eds) Soft Computing in Information Retrieval. Studies in Fuzziness and Soft Computing, vol 50. Physica, Heidelberg. https://doi.org/10.1007/978-3-7908-1849-9_12

  • DOI: https://doi.org/10.1007/978-3-7908-1849-9_12

  • Publisher Name: Physica, Heidelberg

  • Print ISBN: 978-3-7908-2473-5

  • Online ISBN: 978-3-7908-1849-9

  • eBook Packages: Springer Book Archive
