Axioms for Rational Reinforcement Learning

Sunehag, Peter; Hutter, Marcus

doi:10.1007/978-3-642-24412-4_27

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 6925)

Included in the following conference series:

International Conference on Algorithmic Learning Theory

Abstract

We provide a formal, simple and intuitive theory of rational decision making, including sequential decisions that affect the environment. The theory has a geometric flavor, which makes the arguments easy to visualize and understand. Our theory is for complete decision makers, which means that they have a complete set of preferences. Our main result shows that a complete rational decision maker implicitly has a probabilistic model of the environment. We have a countable version of this result that sheds light on the issue of countable vs. finite additivity by showing how it depends on the geometry of the space over which we have preferences. This is achieved by connecting rationality with the Hahn-Banach Theorem. The theory presented here can be viewed as a formalization and extension of the betting odds approach to probability of Ramsey and De Finetti [Ram31, deF37].
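
The betting odds approach mentioned in the abstract can be illustrated with the classical Dutch book argument. The sketch below is not the construction used in the paper (which goes through the Hahn-Banach Theorem); it only shows, under simplifying assumptions, why prices that are not a probability vector expose an agent to a sure loss. The function names `agent_net_payoffs` and `sure_loss_stakes` are hypothetical, and the setup assumes finitely many mutually exclusive, exhaustive outcomes and an agent willing to buy or sell unit tickets at its quoted prices.

```python
from fractions import Fraction


def agent_net_payoffs(prices, stakes):
    """Agent's net gain for each possible outcome.

    prices[i]: price the agent quotes for a ticket paying 1 if outcome i occurs.
    stakes[i]: number of such tickets the agent buys (negative means sells).
    Exactly one outcome occurs (the outcomes form a partition).
    """
    cost = sum(s * p for s, p in zip(stakes, prices))
    return [stakes[j] - cost for j in range(len(prices))]


def sure_loss_stakes(prices):
    """Return stakes under which the agent loses in every outcome, or None."""
    n = len(prices)
    for i, p in enumerate(prices):
        if p < 0:
            # Agent sells ticket i at a negative price: it pays to take on a liability.
            return [-1 if k == i else 0 for k in range(n)]
    total = sum(prices)
    if total > 1:
        return [1] * n   # agent buys every ticket: pays `total`, receives exactly 1
    if total < 1:
        return [-1] * n  # agent sells every ticket: receives `total`, pays exactly 1
    return None          # non-negative and summing to 1: a probability vector


# Prices summing to 3/2 are incoherent; prices (1/2, 1/4, 1/4) are coherent.
incoherent = [Fraction(1, 2)] * 3
stakes = sure_loss_stakes(incoherent)
print(stakes, agent_net_payoffs(incoherent, stakes))
print(sure_loss_stakes([Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)]))
```

Exact `Fraction` arithmetic is used so that the check `total == 1` is not disturbed by floating-point rounding.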

References

Allais, M.: Le comportement de l’homme rationnel devant le risque: Critique des postulats et axiomes de l’école américaine. Econometrica 21(4), 503–546 (1953)
Arrow, K.: Essays in the Theory of Risk-Bearing. North-Holland, Amsterdam (1970)
Cox, R.T.: Probability, frequency and reasonable expectation. Am. Jour. Phys. 14, 1–13 (1946)
de Finetti, B.: La prévision: Ses lois logiques, ses sources subjectives. In: Annales de l’Institut Henri Poincaré, Paris, vol. 7, pp. 1–68 (1937)
Diestel, J.: Sequences and series in Banach spaces. Springer, Heidelberg (1984)
Ellsberg, D.: Risk, Ambiguity, and the Savage Axioms. The Quarterly Journal of Economics 75(4), 643–669 (1961)
Halpern, J.Y.: A counterexample to theorems of Cox and Fine. Journal of Artificial Intelligence Research 10, 67–85 (1999)
Hutter, M.: Universal Artificial Intelligence: Sequential Decisions based on Algorithmic Probability. Springer, Berlin (2005)
Jaynes, E.T.: Probability theory: the logic of science. Cambridge University Press, Cambridge (2003)
Kreyszig, E.: Introductory Functional Analysis With Applications. Wiley, Chichester (1989)
Narici, L., Beckenstein, E.: The Hahn-Banach theorem: the life and times. Topology and its Applications 77(2), 193–211 (1997)
von Neumann, J., Morgenstern, O.: Theory of Games and Economic Behavior. Princeton University Press, Princeton (1944)
Paris, J.B.: The uncertain reasoner’s companion: a mathematical perspective. Cambridge University Press, New York (1994)
Ramsey, F.P.: Truth and probability. In: Braithwaite, R.B. (ed.) The Foundations of Mathematics and other Logical Essays, ch. 7, pp. 156–198. Brace & Co. (1931)
Savage, L.: The Foundations of Statistics. Wiley, New York (1954)
Sugden, R.: Rational choice: A survey of contributions from economics and philosophy. Economic Journal 101(407), 751–785 (1991)

Author information

Authors and Affiliations

Research School of Computer Science, Australian National University, Canberra, ACT, 0200, Australia
Peter Sunehag & Marcus Hutter

Editor information

Editors and Affiliations

Department of Computer Science, University of Helsinki, (Gustaf Hällströmin katu 2b), P.O. Box 68, 00014, Helsinki, Finland
Jyrki Kivinen & Esko Ukkonen
Department of Computing Science, University of Alberta, T6G 2E8, Edmonton, AB, Canada
Csaba Szepesvári
Division of Computer Science, Hokkaido University, N-14, W-9, 060-0814, Sapporo, Japan
Thomas Zeugmann

About this paper

Cite this paper

Sunehag, P., Hutter, M. (2011). Axioms for Rational Reinforcement Learning. In: Kivinen, J., Szepesvári, C., Ukkonen, E., Zeugmann, T. (eds) Algorithmic Learning Theory. ALT 2011. Lecture Notes in Computer Science, vol 6925. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-24412-4_27

DOI: https://doi.org/10.1007/978-3-642-24412-4_27
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-24411-7
Online ISBN: 978-3-642-24412-4
eBook Packages: Computer Science, Computer Science (R0)
