Abstract
Reinforcement learning, the subject of this chapter, is a general approach to learning control and one of the most general frameworks for taking traditional robotics towards true autonomy and versatility. Both single-robot and multi-robot reinforcement learning are very challenging areas because of several issues, such as large state spaces, difficulty in reward assignment, nondeterministic action selection, and difficulty in merging learned experiences from other robots. Many difficulties remain in applying reinforcement learning to robotics and in scaling multi-agent reinforcement learning up to multi-robot systems. After a review of important approaches in this field, some open problems and promising research directions are given.
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Katić, D. (2009). New Trends in Robotic Reinforcement Learning: Single and Multi-robot Case. In: Rudas, I.J., Fodor, J., Kacprzyk, J. (eds) Towards Intelligent Engineering and Information Technology. Studies in Computational Intelligence, vol 243. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-03737-5_18
DOI: https://doi.org/10.1007/978-3-642-03737-5_18
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-03736-8
Online ISBN: 978-3-642-03737-5
eBook Packages: Engineering, Engineering (R0)