Making a Robot Learn to Play Soccer Using Reward and Punishment

  • Conference paper
KI 2007: Advances in Artificial Intelligence (KI 2007)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4667)

Abstract

In this paper, we show how reinforcement learning can be applied to real robots to achieve optimal robot behavior. As an example, we enable an autonomous soccer robot to learn to intercept a rolling ball. The main focus is on how to adapt the Q-learning algorithm to the needs of learning strategies for real robots and how to transfer strategies learned in simulation to real robots.
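
For readers unfamiliar with Q-learning, the following is a minimal sketch of the standard tabular update rule (Watkins and Dayan) on a toy problem. It is not the adapted algorithm or the simulation described in the paper. The one-dimensional track, the function names `q_learning` and `greedy_policy`, and all parameter values are assumptions chosen purely for illustration: an agent starts at cell 0, is punished with -1 for every step, and is rewarded with +10 for reaching a stationary "ball" in the last cell.

```python
import random


def q_learning(n_states=6, goal=5, episodes=500,
               alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning on a hypothetical 1-D track (illustration only)."""
    rng = random.Random(seed)
    moves = (-1, +1)  # action 0: step left, action 1: step right
    q = [[0.0, 0.0] for _ in range(n_states)]

    for _ in range(episodes):
        s = 0
        for _ in range(100):  # cap the episode length
            # Epsilon-greedy action selection with random tie-breaking.
            if rng.random() < epsilon or q[s][0] == q[s][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[s][0] > q[s][1] else 1

            # Deterministic toy dynamics, clipped to the track.
            s_next = min(max(s + moves[a], 0), n_states - 1)
            done = s_next == goal

            # Reward on reaching the goal, punishment for every other step.
            r = 10.0 if done else -1.0

            # Q-learning update: move Q(s, a) toward the bootstrapped target.
            target = r if done else r + gamma * max(q[s_next])
            q[s][a] += alpha * (target - q[s][a])

            s = s_next
            if done:
                break
    return q


def greedy_policy(q):
    """Return the index of the highest-valued action for each state."""
    return [max(range(len(row)), key=lambda a: row[a]) for row in q]


if __name__ == "__main__":
    q_table = q_learning()
    print(greedy_policy(q_table))
```

The table-based representation only works here because the toy state space is tiny. A real robot has continuous states such as positions and velocities, which is one reason the plain algorithm has to be adapted before it can be used on hardware.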

Editor information

Joachim Hertzberg, Michael Beetz, Roman Englert

Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Müller, H., Lauer, M., Hafner, R., Lange, S., Merke, A., Riedmiller, M. (2007). Making a Robot Learn to Play Soccer Using Reward and Punishment. In: Hertzberg, J., Beetz, M., Englert, R. (eds) KI 2007: Advances in Artificial Intelligence. KI 2007. Lecture Notes in Computer Science, vol 4667. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74565-5_18

  • DOI: https://doi.org/10.1007/978-3-540-74565-5_18

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-74564-8

  • Online ISBN: 978-3-540-74565-5

  • eBook Packages: Computer Science, Computer Science (R0)
