Part of the book series: Springer Theses

Abstract

Reinforcement learning is a machine learning method for learning optimal behavior in challenging or uncertain environments. Optimal behavior here is the sequence of decisions that achieves a goal or the best possible outcome. Learning proceeds by trial and error, coupled with feedback from the environment that indicates the utility of each outcome, and ultimately amounts to learning a mapping between actions and outcomes.
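To make the trial-and-error loop above concrete, the following is a minimal sketch of tabular Q-learning, one standard reinforcement learning method, applied to a hypothetical five-state corridor in which the agent starts at one end and receives a reward of +1 for reaching the other. The environment, reward, and hyperparameters are illustrative assumptions and are not taken from this chapter.

```python
# Minimal sketch of trial-and-error learning with environmental feedback:
# tabular Q-learning on a hypothetical 5-state corridor (states 0..4).
# The agent starts in state 0 and receives reward +1 on reaching state 4.
# All values below (environment, reward, hyperparameters) are illustrative.
import random

N_STATES = 5                            # state 4 is the goal
ACTIONS = (-1, +1)                      # step left or right
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1   # learning rate, discount, exploration

# Q maps (state, action) -> estimated utility of taking that action there.
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

def step(state, action):
    """Environment feedback: the next state and a reward signal."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward

def greedy(state):
    """Pick the currently best-valued action, breaking ties at random."""
    best = max(Q[(state, a)] for a in ACTIONS)
    return random.choice([a for a in ACTIONS if Q[(state, a)] == best])

for episode in range(500):
    state = 0
    while state != N_STATES - 1:
        # Trial and error: usually exploit current estimates, sometimes explore.
        action = random.choice(ACTIONS) if random.random() < EPSILON else greedy(state)
        next_state, reward = step(state, action)
        # Temporal-difference update: nudge the estimate toward the observed
        # reward plus the discounted value of the best next action.
        target = reward + GAMMA * max(Q[(next_state, a)] for a in ACTIONS)
        Q[(state, action)] += ALPHA * (target - Q[(state, action)])
        state = next_state

# The learned action-outcome mapping, read off as a greedy policy.
print({s: greedy(s) for s in range(N_STATES - 1)})   # expected: all +1 (toward the goal)
```

Over repeated episodes the table of action-value estimates converges, and the greedy policy read off the table, the learned mapping from states and actions to expected outcomes, consistently moves the agent toward the goal.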

Author information

Correspondence to Christopher Gatti.

Copyright information

© 2015 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Gatti, C. (2015). Introduction. In: Design of Experiments for Reinforcement Learning. Springer Theses. Springer, Cham. https://doi.org/10.1007/978-3-319-12197-0_1

  • DOI: https://doi.org/10.1007/978-3-319-12197-0_1

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-12196-3

  • Online ISBN: 978-3-319-12197-0

  • eBook Packages: Engineering (R0)
