Risk-Sensitive Reinforcement Learning Scheme Is Suitable for Learning on a Budget
Risk-sensitive reinforcement learning (risk-sensitive RL) has been studied by many researchers. The methods are based on prospect theory, which models the value function of a human. Although they are mainly aimed at imitating human behavior, there has been little discussion of their engineering significance. In this paper, we show that risk-sensitive RL is useful for online-learning machines whose resources are limited. In such a learning scheme, part of the learned memory must be removed to make space for recording a new, important instance. The experimental results show that risk-sensitive RL is superior to normal RL under this constraint. This may suggest that, because the human brain is also built from a limited number of neurons, humans employ a risk-sensitive value function for learning.
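As a concrete illustration of the idea, the sketch below shows one common formulation of risk-sensitive temporal-difference learning, in which positive and negative TD errors are weighted asymmetrically so that the learned value function is distorted in a prospect-theory-like manner. This is a minimal, assumed example for clarity only; the parameter name `kappa`, the tabular setting, and the specific update rule are illustrative assumptions, not necessarily the exact scheme used in the paper.

```python
import numpy as np

def risk_sensitive_td_update(V, s, r, s_next, alpha=0.1, gamma=0.95, kappa=0.5):
    """One tabular TD(0) update with asymmetric (risk-sensitive) error weighting.

    kappa in (-1, 1): kappa > 0 over-weights negative surprises (risk-averse),
    kappa < 0 over-weights positive surprises (risk-seeking).
    This is an illustrative formulation, not the authors' exact method.
    """
    delta = r + gamma * V[s_next] - V[s]           # ordinary TD error
    weight = (1.0 - kappa) if delta > 0 else (1.0 + kappa)
    V[s] += alpha * weight * delta                 # asymmetric, risk-sensitive update
    return V

# Toy usage: a 5-state chain with a risk-averse setting.
V = np.zeros(5)
V = risk_sensitive_td_update(V, s=2, r=-1.0, s_next=3, kappa=0.5)
```

In a budget-limited learner of the kind discussed here, such an asymmetric update would be combined with a memory of bounded size from which the least useful stored instances are pruned to admit new ones; that pruning mechanism is not shown in the sketch.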
This research has been supported by a Grant-in-Aid for Scientific Research (C) 12008012.