
The Loss Rank Principle for Model Selection

  • Conference paper
Learning Theory (COLT 2007)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4539)


Abstract

A key issue in statistics and machine learning is to automatically select the “right” model complexity, e.g., the number of neighbors to be averaged over in k nearest neighbor (kNN) regression or the polynomial degree in regression with polynomials. We suggest a novel principle (LoRP) for model selection in regression and classification. It is based on the loss rank, which counts how many other (fictitious) data would be fitted better. LoRP selects the model that has minimal loss rank. Unlike most penalized maximum likelihood variants (AIC, BIC, MDL), LoRP depends only on the regression functions and the loss function. It works without a stochastic noise model and is directly applicable to any non-parametric regressor, like kNN.
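
To make the abstract's verbal definition concrete, the sketch below estimates a loss rank for kNN regression by brute force: it counts, via sampling, how many fictitious target vectors the regressor would fit at least as well as the observed data, and then picks the k with the smallest estimate. This is only an illustration of the counting idea, not the paper's algorithm; the squared loss, the uniform sampling of fictitious targets over the observed range, and the helper names (knn_fit_values, loss_rank) are assumptions made purely for this sketch, whereas the paper derives efficient, sampling-free expressions.

```python
import numpy as np

def knn_fit_values(x, y, k):
    """Fitted values of kNN regression at the training inputs:
    each fitted value averages the targets of the k nearest inputs
    (1-d inputs and absolute distance, kept deliberately simple)."""
    dists = np.abs(x[:, None] - x[None, :])
    nearest = np.argsort(dists, axis=1)[:, :k]
    return y[nearest].mean(axis=1)

def squared_loss(y, y_hat):
    """Squared loss; LoRP works with any loss, this choice is illustrative."""
    return float(np.sum((y - y_hat) ** 2))

def loss_rank(x, y, k, n_fictitious=3000, seed=0):
    """Monte Carlo stand-in for the loss rank: the fraction of fictitious
    target vectors y' that the kNN regressor fits at least as well as the
    observed y.  Sampling y' uniformly over the observed target range is
    an assumption made only for this sketch."""
    rng = np.random.default_rng(seed)
    observed = squared_loss(y, knn_fit_values(x, y, k))
    lo, hi = y.min(), y.max()
    fitted_better = 0
    for _ in range(n_fictitious):
        y_fict = rng.uniform(lo, hi, size=y.shape)
        if squared_loss(y_fict, knn_fit_values(x, y_fict, k)) <= observed:
            fitted_better += 1
    return fitted_better / n_fictitious

# Toy usage: choose the k with minimal estimated loss rank.
rng = np.random.default_rng(1)
x = np.sort(rng.uniform(0.0, 1.0, 40))
y = np.sin(2 * np.pi * x) + 0.1 * rng.normal(size=x.shape)
ranks = {k: loss_rank(x, y, k) for k in (1, 2, 3, 5, 10, 20)}
print(ranks, "-> chosen k:", min(ranks, key=ranks.get))
```

With k = 1 every target vector is fitted perfectly, so essentially all fictitious data tie with the observed one and the estimated rank is maximal; moderate k values fit the observed data much better than random targets and obtain a small rank, which is the complexity trade-off LoRP exploits. Note that with finitely many samples the estimate saturates at zero for all sufficiently rigid models, so it only illustrates the counting idea rather than reproducing the paper's finer ranking.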

References

  1. Akaike, H.: Information theory and an extension of the maximum likelihood principle. In: Proc. 2nd International Symposium on Information Theory, pp. 267–281. Akadémiai Kiadó, Budapest, Hungary (1973)

  2. Bai, Z., Fahey, M., Golub, G.: Some large-scale matrix computation problems. Journal of Computational and Applied Mathematics 74(1–2), 71–89 (1996)

  3. Grünwald, P.D.: Tutorial on minimum description length. In: Minimum Description Length: Recent Advances in Theory and Practice, Chapters 1 and 2. MIT Press, Cambridge (2004) http://www.cwi.nl/~pdg/ftp/mdlintro.pdf

  4. Hastie, T., Tibshirani, R., Friedman, J.H.: The Elements of Statistical Learning. Springer, Heidelberg (2001)

  5. MacKay, D.J.C.: Bayesian interpolation. Neural Computation 4(3), 415–447 (1992)

  6. Reusken, A.: Approximation of the determinant of large sparse symmetric positive definite matrices. SIAM Journal on Matrix Analysis and Applications 23(3), 799–818 (2002)

  7. Rissanen, J.J.: Modeling by shortest data description. Automatica 14(5), 465–471 (1978)

  8. Schwarz, G.: Estimating the dimension of a model. Annals of Statistics 6(2), 461–464 (1978)


Editor information

Nader H. Bshouty, Claudio Gentile


Copyright information

© 2007 Springer Berlin Heidelberg

About this paper

Cite this paper

Hutter, M. (2007). The Loss Rank Principle for Model Selection. In: Bshouty, N.H., Gentile, C. (eds) Learning Theory. COLT 2007. Lecture Notes in Computer Science, vol 4539. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-72927-3_42


  • DOI: https://doi.org/10.1007/978-3-540-72927-3_42

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-72925-9

  • Online ISBN: 978-3-540-72927-3

  • eBook Packages: Computer Science (R0)
