
The Loss Rank Principle for Model Selection

  • Conference paper
Learning Theory (COLT 2007)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4539)


Abstract

A key issue in statistics and machine learning is to automatically select the “right” model complexity, e.g., the number of neighbors to be averaged over in k nearest neighbor (kNN) regression or the polynomial degree in regression with polynomials. We suggest a novel principle (LoRP) for model selection in regression and classification. It is based on the loss rank, which counts how many other (fictitious) data would be fitted better. LoRP selects the model that has minimal loss rank. Unlike most penalized maximum likelihood variants (AIC, BIC, MDL), LoRP depends only on the regression functions and the loss function. It works without a stochastic noise model and is directly applicable to any non-parametric regressor, like kNN.
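
To make the abstract's verbal definition concrete, the sketch below estimates a loss rank for kNN regression by brute force: it counts, via sampling, how many fictitious target vectors the regressor would fit at least as well as the observed data, and then picks the k with the smallest estimate. This is only an illustration of the counting idea, not the paper's algorithm; the squared loss, the uniform sampling of fictitious targets over the observed range, and the helper names (knn_fit_values, loss_rank) are assumptions made purely for this sketch, whereas the paper derives efficient, sampling-free expressions.

```python
import numpy as np

def knn_fit_values(x, y, k):
    """Fitted values of kNN regression at the training inputs:
    each fitted value averages the targets of the k nearest inputs
    (1-d inputs and absolute distance, kept deliberately simple)."""
    dists = np.abs(x[:, None] - x[None, :])
    nearest = np.argsort(dists, axis=1)[:, :k]
    return y[nearest].mean(axis=1)

def squared_loss(y, y_hat):
    """Squared loss; LoRP works with any loss, this choice is illustrative."""
    return float(np.sum((y - y_hat) ** 2))

def loss_rank(x, y, k, n_fictitious=3000, seed=0):
    """Monte Carlo stand-in for the loss rank: the fraction of fictitious
    target vectors y' that the kNN regressor fits at least as well as the
    observed y.  Sampling y' uniformly over the observed target range is
    an assumption made only for this sketch."""
    rng = np.random.default_rng(seed)
    observed = squared_loss(y, knn_fit_values(x, y, k))
    lo, hi = y.min(), y.max()
    fitted_better = 0
    for _ in range(n_fictitious):
        y_fict = rng.uniform(lo, hi, size=y.shape)
        if squared_loss(y_fict, knn_fit_values(x, y_fict, k)) <= observed:
            fitted_better += 1
    return fitted_better / n_fictitious

# Toy usage: choose the k with minimal estimated loss rank.
rng = np.random.default_rng(1)
x = np.sort(rng.uniform(0.0, 1.0, 40))
y = np.sin(2 * np.pi * x) + 0.1 * rng.normal(size=x.shape)
ranks = {k: loss_rank(x, y, k) for k in (1, 2, 3, 5, 10, 20)}
print(ranks, "-> chosen k:", min(ranks, key=ranks.get))
```

With k = 1 every target vector is fitted perfectly, so essentially all fictitious data tie with the observed one and the estimated rank is maximal; moderate k values fit the observed data much better than random targets and obtain a small rank, which is the complexity trade-off LoRP exploits. Note that with finitely many samples the estimate saturates at zero for all sufficiently rigid models, so it only illustrates the counting idea rather than reproducing the paper's finer ranking.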

References

  1. Akaike, H.: Information theory and an extension of the maximum likelihood principle. In: Proc. 2nd International Symposium on Information Theory, pp. 267–281. Akadémiai Kiadó, Budapest, Hungary (1973)

  2. Bai, Z., Fahey, M., Golub, G.: Some large-scale matrix computation problems. Journal of Computational and Applied Mathematics 74(1–2), 71–89 (1996)

  3. Grünwald, P.D.: Tutorial on minimum description length. In: Minimum Description Length: Recent Advances in Theory and Practice, Chapters 1 and 2. MIT Press, Cambridge (2004) http://www.cwi.nl/~pdg/ftp/mdlintro.pdf

  4. Hastie, T., Tibshirani, R., Friedman, J.H.: The Elements of Statistical Learning. Springer, Heidelberg (2001)

  5. MacKay, D.J.C.: Bayesian interpolation. Neural Computation 4(3), 415–447 (1992)

  6. Reusken, A.: Approximation of the determinant of large sparse symmetric positive definite matrices. SIAM Journal on Matrix Analysis and Applications 23(3), 799–818 (2002)

  7. Rissanen, J.J.: Modeling by shortest data description. Automatica 14(5), 465–471 (1978)

  8. Schwarz, G.: Estimating the dimension of a model. Annals of Statistics 6(2), 461–464 (1978)


Editor information

Nader H. Bshouty, Claudio Gentile


Copyright information

© 2007 Springer Berlin Heidelberg

About this paper

Cite this paper

Hutter, M. (2007). The Loss Rank Principle for Model Selection. In: Bshouty, N.H., Gentile, C. (eds) Learning Theory. COLT 2007. Lecture Notes in Computer Science, vol 4539. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-72927-3_42


  • DOI: https://doi.org/10.1007/978-3-540-72927-3_42

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-72925-9

  • Online ISBN: 978-3-540-72927-3

  • eBook Packages: Computer Science (R0)
