Supervised and Semi-supervised Machine Learning Ranking

Vittaut, Jean-Noël; Gallinari, Patrick

doi:10.1007/978-3-540-73888-6_21

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 4518)

Included in the following conference series:

International Workshop of the Initiative for the Evaluation of XML Retrieval

Abstract

We present a semi-supervised machine learning ranking model that automatically learns its parameters from a training set of a few labeled and unlabeled examples, composed of queries and relevance judgments on a subset of the document elements. Our model improves the performance of a baseline Information Retrieval system by optimizing a ranking loss criterion and by combining scores computed from doxels (document elements) and from their local structural context. We analyze the performance of our supervised and semi-supervised algorithms on the CO-Focussed and CO-Thorough tasks, using a baseline model that is an adaptation of Okapi to Structured Information Retrieval.
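
To make the idea of "optimizing a ranking loss criterion" over combined scores concrete, here is a minimal sketch of the supervised case only. It is not the authors' implementation: the abstract does not name the loss or the features, so the pairwise logistic loss, the three-feature layout (doxel score, parent score, document score), the toy values, and all function names are assumptions chosen for illustration.

```python
import math

def score(weights, features):
    """Linear combination of the scores attached to one doxel."""
    return sum(w * x for w, x in zip(weights, features))

def pairwise_loss(weights, relevant, irrelevant):
    """Mean logistic loss over all (relevant, irrelevant) doxel pairs.

    The loss is small when every relevant doxel is scored above
    every irrelevant one (assumed loss, not taken from the paper).
    """
    total = 0.0
    for r in relevant:
        for i in irrelevant:
            margin = score(weights, r) - score(weights, i)
            total += math.log1p(math.exp(-margin))
    return total / (len(relevant) * len(irrelevant))

def train(relevant, irrelevant, n_features, lr=0.5, epochs=200):
    """Plain gradient descent on the pairwise loss, starting from zero weights."""
    weights = [0.0] * n_features
    n_pairs = len(relevant) * len(irrelevant)
    for _ in range(epochs):
        grad = [0.0] * n_features
        for r in relevant:
            for i in irrelevant:
                margin = score(weights, r) - score(weights, i)
                # d/dm log(1 + exp(-m)) = -1 / (1 + exp(m))
                coeff = -1.0 / (1.0 + math.exp(margin))
                for k in range(n_features):
                    grad[k] += coeff * (r[k] - i[k])
        weights = [w - lr * g / n_pairs for w, g in zip(weights, grad)]
    return weights

# Hypothetical feature vectors for one query:
# [doxel score, parent score, document score]
relevant = [[0.9, 0.7, 0.6], [0.6, 0.8, 0.7]]
irrelevant = [[0.8, 0.2, 0.3], [0.3, 0.4, 0.2], [0.5, 0.1, 0.4]]

initial_loss = pairwise_loss([0.0, 0.0, 0.0], relevant, irrelevant)
weights = train(relevant, irrelevant, n_features=3)
final_loss = pairwise_loss(weights, relevant, irrelevant)
print("learned weights:", weights)
print("loss before/after:", initial_loss, final_loss)
```

In this toy setting the learned weights say how much each score source contributes to the final ranking. A semi-supervised variant would additionally exploit unlabeled doxels, which this sketch does not attempt.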

Author information

Authors and Affiliations

Laboratoire d’Informatique de Paris 6, 104, avenue du Président-Kennedy, F-75016 Paris, France
Jean-Noël Vittaut & Patrick Gallinari

Editor information

Norbert Fuhr, Mounia Lalmas, Andrew Trotman

About this paper

Cite this paper

Vittaut, JN., Gallinari, P. (2007). Supervised and Semi-supervised Machine Learning Ranking. In: Fuhr, N., Lalmas, M., Trotman, A. (eds) Comparative Evaluation of XML Information Retrieval Systems. INEX 2006. Lecture Notes in Computer Science, vol 4518. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-73888-6_21

DOI: https://doi.org/10.1007/978-3-540-73888-6_21
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-73887-9
Online ISBN: 978-3-540-73888-6
eBook Packages: Computer Science, Computer Science (R0)
