Probabilistic Learning by Uncertainty Sampling with Non-Binary Relevance

Amati, Gianni; Crestani, Fabio

doi:10.1007/978-3-7908-1849-9_12

Part of the book series: Studies in Fuzziness and Soft Computing (STUDFUZZ, volume 50)

Abstract

We present a model for probabilistic learning in information retrieval and information filtering that is based on the concept of “uncertainty sampling”. Uncertainty sampling is a technique that exploits user relevance feedback on both relevant and non-relevant documents. In particular, it uses those documents whose relevance is most uncertain to speed up the learning of the user's relevance criteria. We extend uncertainty sampling by considering multiple levels of relevance, and we show how this new learning model for information retrieval and filtering could be evaluated using collections with non-binary relevance assessments.
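
The selection step that the abstract describes can be illustrated with a minimal Python sketch. It is not the authors' model: it assumes, purely for illustration, that some classifier already supplies a probability distribution over relevance levels for each unjudged document, and it uses Shannon entropy as the uncertainty measure. The names `entropy` and `most_uncertain` and the sample distributions are hypothetical.

```python
import math


def entropy(dist):
    """Shannon entropy (in bits) of a distribution over relevance levels."""
    return -sum(p * math.log2(p) for p in dist if p > 0)


def most_uncertain(predictions, k=1):
    """Return the ids of the k documents whose predicted relevance is most uncertain.

    `predictions` maps a document id to a list of probabilities, one per
    relevance level (assumed here: not relevant, partially relevant, relevant).
    """
    ranked = sorted(
        predictions,
        key=lambda doc_id: entropy(predictions[doc_id]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical classifier output for four unjudged documents.
predictions = {
    "d1": [0.90, 0.05, 0.05],  # confidently not relevant
    "d2": [0.34, 0.33, 0.33],  # close to uniform: most uncertain
    "d3": [0.10, 0.10, 0.80],  # fairly confidently relevant
    "d4": [0.50, 0.50, 0.00],  # torn between two levels
}

# These are the documents one would show to the user for feedback next.
print(most_uncertain(predictions, k=2))
```

With binary relevance the same ranking reduces to picking the documents whose estimated probability of relevance is closest to 0.5. Entropy is only one possible way to carry that idea over to several relevance levels.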

Author information

Authors and Affiliations

Fondazione Ugo Bordoni, via B. Castiglione, 59, Roma, Italy
Gianni Amati
Computing Science Department, University of Glasgow, Glasgow, G12 8QQ, Scotland
Fabio Crestani

Editor information

Editors and Affiliations

Department of Computing Science, University of Glasgow, Glasgow, G12 8QQ, Scotland
Fabio Crestani
ITIM-CNR, Via Ampère 56, 20131, Milano, Italy
Gabriella Pasi

About this chapter

Cite this chapter

Amati, G., Crestani, F. (2000). Probabilistic Learning by Uncertainty Sampling with Non-Binary Relevance. In: Crestani, F., Pasi, G. (eds) Soft Computing in Information Retrieval. Studies in Fuzziness and Soft Computing, vol 50. Physica, Heidelberg. https://doi.org/10.1007/978-3-7908-1849-9_12

DOI: https://doi.org/10.1007/978-3-7908-1849-9_12
Publisher Name: Physica, Heidelberg
Print ISBN: 978-3-7908-2473-5
Online ISBN: 978-3-7908-1849-9
eBook Packages: Springer Book Archive
