Learning MDP Action Models Via Discrete Mixture Trees

Wynkoop, Michael; Dietterich, Thomas

doi:10.1007/978-3-540-87481-2_39

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5212)

Included in the following conference series:

Joint European Conference on Machine Learning and Knowledge Discovery in Databases

Abstract

This paper addresses the problem of learning dynamic Bayesian network (DBN) models to support reinforcement learning. It focuses on learning regression tree (context-specific dependence) models of the conditional probability distributions of the DBNs. Existing algorithms rely on standard regression tree learning methods (both propositional and relational). However, such methods presume that the stochasticity in the domain can be modeled as a deterministic function with additive noise. This is inappropriate for many RL domains, where the stochasticity takes the form of stochastic choice over deterministic functions. This paper introduces a regression tree algorithm in which each leaf node is modeled as a finite mixture of deterministic functions. This mixture is approximated via a greedy set cover. Experiments on three challenging RL domains show that this approach finds trees that are more accurate and that are more likely to correctly identify the conditional dependencies in the DBNs based on small samples.
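The leaf-fitting idea in the abstract can be illustrated with a minimal sketch. It assumes a hypothetical, hand-written pool of candidate deterministic functions and a handful of made-up (state, next value) samples; neither comes from the paper, and the paper's actual candidate generation, scoring, and stopping rules are not given in the abstract. A candidate "covers" a sample when it predicts that sample's outcome exactly, and greedy set cover repeatedly picks the candidate that covers the most still-uncovered samples. The mixture weights here are simply the fraction of samples each pick newly covers, which is an assumption of this sketch.

```python
from typing import Callable, List, Sequence, Tuple

State = Tuple[int, ...]
Sample = Tuple[State, int]                     # (state, observed next value)
Candidate = Tuple[str, Callable[[State], int]]  # (label, deterministic function)


def greedy_mixture(
    samples: Sequence[Sample], candidates: Sequence[Candidate]
) -> Tuple[List[Tuple[str, float]], int]:
    """Approximate a leaf's mixture of deterministic functions by greedy set cover.

    Returns the chosen (label, weight) pairs and the number of samples that
    no candidate could explain.
    """
    uncovered = set(range(len(samples)))
    mixture: List[Tuple[str, float]] = []
    while uncovered:
        best_name, best_hits = None, set()
        for name, f in candidates:
            # Samples this candidate predicts exactly among those still uncovered.
            hits = {i for i in uncovered if f(samples[i][0]) == samples[i][1]}
            if len(hits) > len(best_hits):
                best_name, best_hits = name, hits
        if not best_hits:
            break  # the remaining samples are explained by no candidate
        # Assumed weighting: share of all samples newly covered by this pick.
        mixture.append((best_name, len(best_hits) / len(samples)))
        uncovered -= best_hits
    return mixture, len(uncovered)


# Hypothetical data: a one-variable state whose value usually drops by 1
# but occasionally stays the same (stochastic choice between two functions).
samples = [((5,), 4), ((3,), 2), ((4,), 3), ((2,), 2), ((6,), 5)]
candidates = [
    ("decrement", lambda s: s[0] - 1),
    ("unchanged", lambda s: s[0]),
    ("reset_to_zero", lambda s: 0),
]
mixture, unexplained = greedy_mixture(samples, candidates)
print(mixture, unexplained)
```

On this toy data a single regression fit with additive noise would predict a value between the two outcomes, whereas the cover recovers two exact functions with separate weights, which is the distinction the abstract draws.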

Author information

Authors and Affiliations

Oregon State University, Corvallis, OR 97370, USA
Michael Wynkoop & Thomas Dietterich

Editor information

Walter Daelemans, Bart Goethals, Katharina Morik

Cite this paper

Wynkoop, M., Dietterich, T. (2008). Learning MDP Action Models Via Discrete Mixture Trees. In: Daelemans, W., Goethals, B., Morik, K. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2008. Lecture Notes in Computer Science, vol 5212. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-87481-2_39

DOI: https://doi.org/10.1007/978-3-540-87481-2_39
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-87480-5
Online ISBN: 978-3-540-87481-2
eBook Packages: Computer Science (R0)
