Abstract
The mountain car problem is a common benchmark for evaluating reinforcement learning algorithms. A car is placed in a valley, and the goal is to drive it out of the valley (Fig. 5.1). The car's engine is not powerful enough to climb out directly, so the car must instead build up momentum by driving successively up opposing sides of the valley. This chapter explores the mountain car problem using sequential CART and stochastic kriging to understand the parameter space.
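The momentum argument can be illustrated with a minimal sketch. The dynamics and constants below are assumptions taken from the widely used textbook formulation of the problem, not necessarily the settings used in this chapter, and the names `step`, `simulate`, `full_throttle`, and `build_momentum` are hypothetical helpers introduced only for illustration.

```python
import math

# Assumed constants from the common textbook formulation of mountain car;
# the chapter's own parameter values may differ.
X_MIN, X_MAX = -1.2, 0.5        # position bounds; X_MAX is the goal
V_MAX = 0.07                    # speed limit in either direction
FORCE, GRAVITY = 0.001, 0.0025  # engine strength vs. pull of the slope


def step(x, v, action):
    """Advance one time step; action is -1 (reverse), 0 (coast), or +1 (forward)."""
    v = v + FORCE * action - GRAVITY * math.cos(3.0 * x)
    v = max(-V_MAX, min(V_MAX, v))
    x = x + v
    if x < X_MIN:               # inelastic wall on the left side of the valley
        x, v = X_MIN, 0.0
    return x, v


def simulate(policy, max_steps=1000, x=-0.5, v=0.0):
    """Run one episode; return (reached_goal, steps_taken)."""
    for t in range(1, max_steps + 1):
        x, v = step(x, v, policy(x, v))
        if x >= X_MAX:
            return True, t
    return False, max_steps


def full_throttle(x, v):
    """Always drive toward the goal."""
    return 1


def build_momentum(x, v):
    """Push in the direction the car is already moving."""
    return 1 if v >= 0 else -1


if __name__ == "__main__":
    print("full throttle :", simulate(full_throttle))
    print("build momentum:", simulate(build_momentum))
```

Under these assumed constants, driving forward at full throttle from the valley floor never reaches the goal, while the hand-written momentum rule does. A learning algorithm has to discover a policy of the second kind from experience.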
Copyright information
© 2015 Springer International Publishing Switzerland
About this chapter
Cite this chapter
Gatti, C. (2015). The Mountain Car Problem. In: Design of Experiments for Reinforcement Learning. Springer Theses. Springer, Cham. https://doi.org/10.1007/978-3-319-12197-0_5
DOI: https://doi.org/10.1007/978-3-319-12197-0_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-12196-3
Online ISBN: 978-3-319-12197-0
eBook Packages: Engineering, Engineering (R0)