Encyclopedia of Systems and Control

Living Edition
Editors: John Baillieul, Tariq Samad

Learning Theory

  • Mathukumalli Vidyasagar
Living reference work entry
DOI: https://doi.org/10.1007/978-1-4471-5102-9_227-1


How does a machine learn an abstract concept from examples? How can a machine generalize to previously unseen situations? Learning theory is the study of formalized versions of such questions. Because these questions can be formalized in many ways, this entry focuses on one particular formalism, known as PAC (probably approximately correct) learning. PAC learning theory is rich enough to capture intuitive notions of what learning should mean in applications and, at the same time, is amenable to formal mathematical analysis. Several precise and complete treatments of PAC learning theory exist, many of which are cited in the bibliography, so this entry is devoted to sketching some high-level ideas.

Problem Formulation

In the PAC formalism, the starting point is the premise that there is an unknown set, say an unknown convex polygon or an unknown half-plane. The unknown set cannot be completely unknown;...
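
As a rough illustration of the kind of setting described above, the following Python sketch learns a one-dimensional analogue of an unknown half-plane: an unknown half-line [t, 1] inside [0, 1]. Everything specific here is an assumption made for illustration and is not taken from the entry: the samples are drawn uniformly at random, the learner returns the smallest positively labelled point, and the function names (sample_size, learn_threshold, run_trial, failure_rate) are hypothetical. For this particular learner, the error exceeds eps only if no sample falls in [t, t + eps), which happens with probability at most (1 - eps)^m <= exp(-eps m), so m >= (1/eps) ln(1/delta) samples suffice for the hypothesis to be approximately correct (error at most eps) with probability at least 1 - delta.

```python
import math
import random


def sample_size(eps, delta):
    """Number of samples sufficient for this learner: m >= (1/eps) * ln(1/delta)."""
    return math.ceil(math.log(1.0 / delta) / eps)


def learn_threshold(samples):
    """Return the smallest positively labelled point (1.0 if there is none)."""
    positives = [x for x, label in samples if label]
    return min(positives) if positives else 1.0


def run_trial(t, m, rng):
    """Draw m uniform points on [0, 1], label them by x >= t, return the error."""
    samples = [(x, x >= t) for x in (rng.random() for _ in range(m))]
    h = learn_threshold(samples)
    # Under the uniform distribution (an assumption of this sketch), the
    # hypothesis [h, 1] and the target [t, 1] disagree exactly on [t, h),
    # a set of probability h - t.
    return h - t


def failure_rate(t, eps, delta, trials=2000, seed=0):
    """Empirical fraction of trials in which the error exceeds eps."""
    rng = random.Random(seed)
    m = sample_size(eps, delta)
    failures = sum(run_trial(t, m, rng) > eps for _ in range(trials))
    return failures / trials


if __name__ == "__main__":
    # Hypothetical target threshold and accuracy/confidence parameters.
    print(sample_size(0.1, 0.05))
    print(failure_rate(t=0.3, eps=0.1, delta=0.05))
```

The empirical failure rate printed at the end should fall near or below delta, which is the "probably" part of the name; the bound on the error is the "approximately correct" part.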



  1. Anthony M, Bartlett PL (1999) Neural network learning: theoretical foundations. Cambridge University Press, Cambridge
  2. Anthony M, Biggs N (1992) Computational learning theory. Cambridge University Press, Cambridge
  3. Benedek G, Itai A (1991) Learnability by fixed distributions. Theor Comput Sci 86:377–389
  4. Blumer A, Ehrenfeucht A, Haussler D, Warmuth M (1989) Learnability and the Vapnik-Chervonenkis dimension. J ACM 36(4):929–965
  5. Campi M, Vidyasagar M (2001) Learning with prior information. IEEE Trans Autom Control 46(11):1682–1695
  6. Devroye L, Györfi L, Lugosi G (1996) A probabilistic theory of pattern recognition. Springer, New York
  7. Gamarnik D (2003) Extension of the PAC framework to finite and countable Markov chains. IEEE Trans Inf Theory 49(1):338–345
  8. Kearns M, Vazirani U (1994) Introduction to computational learning theory. MIT, Cambridge
  9. Kulkarni SR, Vidyasagar M (1997) Learning decision rules under a family of probability measures. IEEE Trans Inf Theory 43(1):154–166
  10. Meir R (2000) Nonparametric time series prediction through adaptive model selection. Mach Learn 39(1):5–34
  11. Natarajan BK (1991) Machine learning: a theoretical approach. Morgan-Kaufmann, San Mateo
  12. van der Vaart AW, Wellner JA (1996) Weak convergence and empirical processes. Springer, New York
  13. Vapnik VN (1995) The nature of statistical learning theory. Springer, New York
  14. Vapnik VN (1998) Statistical learning theory. Wiley, New York
  15. Vidyasagar M (1997) A theory of learning and generalization. Springer, London
  16. Vidyasagar M (2003) Learning and generalization: with applications to neural networks. Springer, London

Copyright information

© Springer-Verlag London 2014

Authors and Affiliations

  1. University of Texas at Dallas, Richardson, USA