Abstract
Finding optimal controllers for stochastic systems is a challenging problem studied by both the optimal control and reinforcement learning communities. Markov Decision Processes provide a classic paradigm for such problems, but the underlying optimization problem is difficult to solve. In this paper, we explore the use of Particle Swarm Optimization to learn optimal controllers and show through non-trivial experiments that it is a promising approach.
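The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea it names: a particle swarm searches over policy parameters, and each candidate is scored by a Monte-Carlo estimate of its return. Everything specific here is an assumption made for illustration and is not taken from the paper: the toy one-dimensional system, the linear policy `u = -theta[0] * x`, the quadratic reward, the function names, and the swarm coefficients (commonly used inertia and acceleration values).

```python
import random


def rollout_return(theta, rng, horizon=20, noise_std=0.1):
    """Return of one noisy episode on a toy 1-D system (hypothetical).

    Dynamics: x <- x + u + Gaussian noise. Reward: -(x^2 + 0.1 * u^2).
    Policy: linear state feedback u = -theta[0] * x.
    """
    x = 1.0
    total = 0.0
    for _ in range(horizon):
        u = -theta[0] * x
        total -= x * x + 0.1 * u * u
        x = x + u + rng.gauss(0.0, noise_std)
    return total


def mc_value(theta, rng, n_rollouts=20):
    """Monte-Carlo estimate of the expected return of the policy theta."""
    return sum(rollout_return(theta, rng) for _ in range(n_rollouts)) / n_rollouts


def pso_policy_search(n_particles=10, n_iters=30, seed=0,
                      w=0.729, c1=1.494, c2=1.494):
    """Global-best PSO that maximizes the noisy Monte-Carlo return."""
    rng = random.Random(seed)
    dim = 1  # a single feedback gain in this toy problem
    pos = [[rng.uniform(-2.0, 2.0) for _ in range(dim)]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]                # personal best positions
    best_val = [mc_value(p, rng) for p in pos]    # personal best estimates
    g = max(range(n_particles), key=lambda i: best_val[i])
    g_pos, g_val = best_pos[g][:], best_val[g]    # swarm-wide best

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (best_pos[i][d] - pos[i][d])
                             + c2 * r2 * (g_pos[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            val = mc_value(pos[i], rng)
            if val > best_val[i]:
                best_val[i], best_pos[i] = val, pos[i][:]
                if val > g_val:
                    g_val, g_pos = val, pos[i][:]
    return g_pos, g_val
```

Calling `theta, value = pso_policy_search()` returns the best gain found and its estimated return. Because the return estimates are noisy, the stored best value tends to be optimistic; re-evaluating the final policy with fresh rollouts gives a fairer estimate.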
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Fix, J., Geist, M. (2012). Monte-Carlo Swarm Policy Search. In: Rutkowski, L., Korytkowski, M., Scherer, R., Tadeusiewicz, R., Zadeh, L.A., Zurada, J.M. (eds) Swarm and Evolutionary Computation. SIDE 2012, EC 2012. Lecture Notes in Computer Science, vol 7269. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-29353-5_9
DOI: https://doi.org/10.1007/978-3-642-29353-5_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-29352-8
Online ISBN: 978-3-642-29353-5
eBook Packages: Computer Science, Computer Science (R0)