# Learning Theory

**DOI:** https://doi.org/10.1007/978-1-4471-5102-9_227-1

## Introduction

How does a machine learn an abstract concept from examples? How can a machine generalize to previously unseen situations? Learning theory is the study of (formalized versions of) such questions. Such questions can be formalized in many ways; the focus of this entry is on one particular formalism, known as PAC (probably approximately correct) learning. PAC learning theory turns out to be rich enough to capture intuitive notions of what learning should mean in applications and, at the same time, amenable to formal mathematical analysis. Several precise and complete treatments of PAC learning theory exist, many of which are cited in the bibliography; this entry is therefore devoted to sketching some high-level ideas.

## Problem Formulation

In the PAC formalism, the starting point is the premise that there is an unknown set, say an unknown convex polygon, or an unknown half-plane. The unknown set cannot be *completely* unknown;...
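The setup sketched above can be illustrated with a minimal simulation. In this sketch (all names and parameters are illustrative assumptions, not part of the entry), the unknown set is a threshold concept on the interval [0, 1]: a point is labeled positive exactly when it lies at or above an unknown threshold `t_star`. A learner sees i.i.d. labeled samples and outputs a consistent hypothesis; under the uniform distribution, its true error is simply the length of the interval on which hypothesis and target disagree.

```python
import random

# Hypothetical illustration of the PAC setup: the unknown concept is a
# threshold t_star on [0, 1]; a point x is labeled positive iff x >= t_star.

def sample(m, t_star, rng):
    """Draw m i.i.d. labeled examples (x, [x >= t_star]) from the uniform distribution."""
    return [(x, x >= t_star) for x in (rng.random() for _ in range(m))]

def learn_threshold(data):
    """A consistent learner: return the smallest positively labeled point."""
    positives = [x for x, label in data if label]
    return min(positives) if positives else 1.0

def true_error(t_hat, t_star):
    """Probability that hypothesis and target disagree, under the uniform distribution."""
    return abs(t_hat - t_star)

rng = random.Random(0)
t_star = 0.3                       # unknown to the learner
data = sample(2000, t_star, rng)   # i.i.d. labeled samples
t_hat = learn_threshold(data)
print(true_error(t_hat, t_star))   # small with high probability over the sample
```

With 2000 samples the hypothesis threshold lands just above the true one, so the error is small with high probability but not exactly zero: this is the "probably approximately correct" guarantee in miniature.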

## Bibliography

- Anthony M, Bartlett PL (1999) Neural network learning: theoretical foundations. Cambridge University Press, Cambridge
- Anthony M, Biggs N (1992) Computational learning theory. Cambridge University Press, Cambridge
- Benedek G, Itai A (1991) Learnability by fixed distributions. Theor Comput Sci 86:377–389
- Blumer A, Ehrenfeucht A, Haussler D, Warmuth M (1989) Learnability and the Vapnik-Chervonenkis dimension. J ACM 36(4):929–965
- Campi M, Vidyasagar M (2001) Learning with prior information. IEEE Trans Autom Control 46(11):1682–1695
- Devroye L, Györfi L, Lugosi G (1996) A probabilistic theory of pattern recognition. Springer, New York
- Gamarnik D (2003) Extension of the PAC framework to finite and countable Markov chains. IEEE Trans Inf Theory 49(1):338–345
- Kearns M, Vazirani U (1994) Introduction to computational learning theory. MIT Press, Cambridge
- Kulkarni SR, Vidyasagar M (1997) Learning decision rules under a family of probability measures. IEEE Trans Inf Theory 43(1):154–166
- Meir R (2000) Nonparametric time series prediction through adaptive model selection. Mach Learn 39(1):5–34
- Natarajan BK (1991) Machine learning: a theoretical approach. Morgan-Kaufmann, San Mateo
- van der Vaart AW, Wellner JA (1996) Weak convergence and empirical processes. Springer, New York
- Vapnik VN (1995) The nature of statistical learning theory. Springer, New York
- Vapnik VN (1998) Statistical learning theory. Wiley, New York
- Vidyasagar M (1997) A theory of learning and generalization. Springer, London
- Vidyasagar M (2003) Learning and generalization: with applications to neural networks. Springer, London