Batch Reinforcement Learning for Controlling a Mobile Wheeled Pendulum Robot

Bonarini, Andrea; Caccia, Claudio; Lazaric, Alessandro; Restelli, Marcello

doi:10.1007/978-0-387-09695-7_15

Conference paper

Part of the book series: IFIP – The International Federation for Information Processing (IFIPAICT, volume 276)

Abstract

This paper presents an application of Reinforcement Learning (RL) methods to robot control. The main objective is to analyze the behavior of batch RL algorithms when applied to a Mobile Wheeled Pendulum (MWP) robot. We focus on a common problem in classical control theory, following a reference state (e.g., a position set point), and try to solve it with RL. In this setting, the state space of the robot gains one more dimension, which represents the desired value of the tracked variable, while the cost function is evaluated on the difference between the state and the reference. Within this framework some interesting aspects arise, such as the ability of the RL algorithm to generalize to reference points never considered during the training phase. The performance of the learning method has been empirically analyzed and, when possible, compared to a classic control algorithm, namely linear quadratic optimal control (LQR).
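The state augmentation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the quadratic cost, the two-dimensional state, the choice to relabel each sample with several reference values, and all function names are assumptions made only for illustration.

```python
import numpy as np

def augment_state(state, reference):
    """Append the reference set point to the robot state as one extra dimension."""
    return np.append(np.asarray(state, dtype=float), float(reference))

def tracking_cost(state, reference, tracked_index=0):
    """Quadratic cost on the gap between the tracked variable and its reference.

    The quadratic form is an assumption; the abstract only says the cost
    depends on the difference between state and reference.
    """
    return (float(state[tracked_index]) - float(reference)) ** 2

def build_batch(transitions, references):
    """Pair each (state, action, next_state) sample with every reference value.

    Returns a list of (augmented_state, action, cost, augmented_next_state)
    tuples, the kind of batch a fitted-Q style learner could consume.
    The reference is assumed constant across a single transition.
    """
    batch = []
    for state, action, next_state in transitions:
        for ref in references:
            batch.append((
                augment_state(state, ref),
                action,
                tracking_cost(next_state, ref),
                augment_state(next_state, ref),
            ))
    return batch

# Hypothetical 2-D state (position, velocity); values are illustrative only.
transitions = [((0.0, 0.0), 1, (0.1, 0.2)), ((0.1, 0.2), -1, (0.1, 0.0))]
batch = build_batch(transitions, references=[0.0, 0.5])
```

Because the reference is part of the learner's input, a regressor trained on such a batch can in principle be queried at reference values absent from the training set, which is the generalization question the abstract raises.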

Author information

Authors and Affiliations

Dept. of Electronics and Information, Politecnico di Milano, Piazza Leonardo da Vinci 32, I-20133, Milan, Italy
Andrea Bonarini, Alessandro Lazaric & Marcello Restelli
Dept. of Informatics, Systems and Communication, Università degli Studi di Milano – Bicocca, Viale Sarca 336, I-20126, Milan, Italy
Claudio Caccia

Editor information

Editors and Affiliations

University of Portsmouth, UK
Max Bramer

About this paper

Cite this paper

Bonarini, A., Caccia, C., Lazaric, A., Restelli, M. (2008). Batch Reinforcement Learning for Controlling a Mobile Wheeled Pendulum Robot. In: Bramer, M. (eds) Artificial Intelligence in Theory and Practice II. IFIP AI 2008. IFIP – The International Federation for Information Processing, vol 276. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-09695-7_15

DOI: https://doi.org/10.1007/978-0-387-09695-7_15
Publisher Name: Springer, Boston, MA
Print ISBN: 978-0-387-09694-0
Online ISBN: 978-0-387-09695-7
eBook Packages: Computer Science, Computer Science (R0)
