
Introduction to Learning Theory


Abstract

In the supervised setting, a learning model builds a prediction function from a finite set of examples, called the training set (Fukunaga, Introduction to Statistical Pattern Recognition, 1972; Duda et al., Pattern Classification, 2001; Boucheron et al., Theory of Classification: A Survey of Some Recent Advances, 323–375, 2005), where each example is a pair formed by the vector representation of an observation and its associated desired output. The goal of learning is hence to induce a function that predicts the outputs of new observations while committing the lowest prediction error (also called generalization error). The underlying assumption here is that examples are stationary, i.e. that the examples of the training set, on which the prediction function is learned, are somehow representative of the general problem to be solved. In practice, from a given function class, the learning algorithm chooses the function having the lowest empirical error over the training set. For each observation in the training set, the error function used to estimate the empirical error quantifies the disagreement between the output predicted by the function being learned for that observation and its associated desired output. The overall aim of this search is not to find the function that perfectly predicts the desired outputs of the observations in the training set (which would amount to overfitting), but, as we have evoked, to find the function that has good generalization performance. In this chapter, we present the theory of machine learning based on the framework of Vapnik (1999). More specifically, we describe the notion of consistency, which guarantees the learnability of a prediction function. Definitions and assumptions of this theory, as well as the empirical risk minimization principle, are described in Section 2.1. The study of the consistency of this principle, presented in Section 2.2, leads us to the second principle of learning theory, referred to as structural risk minimization, which states that learning is a compromise between achieving a low empirical risk and controlling the complexity of the function class in use. We particularly present two tools to measure this complexity, leading to different types of generalization bounds that have been the basis for the development of new learning algorithms in recent years.
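
To make the quantities in this abstract concrete, the following is a minimal formal sketch of the empirical risk minimization principle, with notation assumed here rather than taken from the chapter: a training set $S=\{(x_i,y_i)\}_{i=1}^{n}$ drawn i.i.d. from an unknown distribution $\mathcal{D}$, a function class $\mathcal{F}$, and a loss $\ell$ quantifying the disagreement between a prediction and the desired output.

\[
  R(f) \;=\; \mathbb{E}_{(x,y)\sim\mathcal{D}}\big[\ell(f(x),y)\big]
  \qquad\text{(generalization error)}
\]
\[
  \hat{R}_n(f) \;=\; \frac{1}{n}\sum_{i=1}^{n}\ell\big(f(x_i),y_i\big)
  \qquad\text{(empirical error on } S\text{)}
\]
\[
  \hat{f}_n \;\in\; \operatorname*{arg\,min}_{f\in\mathcal{F}} \hat{R}_n(f)
  \qquad\text{(empirical risk minimization)}
\]

Under this notation, the generalization bounds evoked at the end of the abstract typically take the form: with probability at least $1-\delta$ over the draw of $S$, for all $f\in\mathcal{F}$, $R(f) \le \hat{R}_n(f) + C(\mathcal{F},n,\delta)$, where the complexity term $C(\mathcal{F},n,\delta)$ depends on a capacity measure of the class $\mathcal{F}$; structural risk minimization then balances the two terms on the right-hand side.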


Notes

  1. The opposite reasoning, called deduction, is based on axioms and consists in producing specific rules (which are always true) as consequences of the general law.

  2. The lemma was first stated, in a slightly different form, in Vapnik and Cervonenkis (1971).

Author information


Corresponding author

Correspondence to Massih-Reza Amini.


Copyright information

© 2015 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Amini, MR., Usunier, N. (2015). Introduction to Learning Theory. In: Learning with Partially Labeled and Interdependent Data. Springer, Cham. https://doi.org/10.1007/978-3-319-15726-9_2

Download citation

  • DOI: https://doi.org/10.1007/978-3-319-15726-9_2

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-15725-2

  • Online ISBN: 978-3-319-15726-9

  • eBook Packages: Computer Science, Computer Science (R0)
