Abstract
In many real-world reinforcement learning problems, an agent needs to control multiple actions simultaneously. To learn under this circumstance, previously, each action was commonly treated independently of the others. However, these multiple actions are rarely independent in applications, and learning could be accelerated if the underlying relationship among the actions were utilized. This paper explores the multi-action relationship in reinforcement learning. We propose to learn the multi-action relationship by enforcing a regularization term that captures the relationship. We incorporate the regularization term into the least-squares policy-iteration and temporal-difference methods, which results in efficiently solvable convex learning objectives. The proposed methods are validated empirically in several domains. Experimental results show that incorporating the multi-action relationship can effectively improve learning performance.
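To make the idea concrete, the following is a minimal sketch of a relationship-regularized least-squares fit in the style of the regularization approach of Zhang and Yeung cited below. It is illustrative only: the function `fit_related_actions` and its arguments are hypothetical names, `Phi` stands for state features, `Y` for per-action-dimension regression targets (e.g. from a fitted value-update step), and the paper's actual objective and solver may differ. For a fixed relationship matrix `Omega`, the regularized least-squares condition is a Sylvester equation in the weight matrix `W`; for a fixed `W`, `Omega` has a closed-form update proportional to `(W^T W)^{1/2}`.

```python
import numpy as np
from scipy.linalg import solve_sylvester, sqrtm

def fit_related_actions(Phi, Y, lam=0.1, n_iters=10):
    """Alternating minimization sketch (assumed form, not the paper's exact
    algorithm): learn a weight matrix W with one column per action dimension,
    regularized by tr(W Omega^{-1} W^T), together with the action-relationship
    matrix Omega."""
    k = Y.shape[1]
    Omega = np.eye(k) / k       # start from independent, equally weighted actions
    A = Phi.T @ Phi             # d x d Gram matrix of the features
    Q = Phi.T @ Y               # d x k cross term
    for _ in range(n_iters):
        # For fixed Omega, the optimality condition
        #   (Phi^T Phi) W + lam * W * Omega^{-1} = Phi^T Y
        # is a Sylvester equation A W + W B = Q.
        B = lam * np.linalg.inv(Omega + 1e-8 * np.eye(k))
        W = solve_sylvester(A, B, Q)
        # For fixed W, the trace-normalized closed-form update of Omega.
        S = sqrtm(W.T @ W).real
        Omega = S / np.trace(S)
    return W, Omega
```

When the targets of two action dimensions are correlated, the learned `Omega` acquires corresponding off-diagonal mass, so the regularizer couples their weight vectors instead of treating each column of `W` independently.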
Y. Yu—This research was supported by the NSFC (61375061, 61223003), Foundation for the Author of National Excellent Doctoral Dissertation of China (201451).
Copyright information
© 2016 Springer International Publishing Switzerland
Cite this paper
Wang, H., Yu, Y. (2016). Exploring Multi-action Relationship in Reinforcement Learning. In: Booth, R., Zhang, M.-L. (eds) PRICAI 2016: Trends in Artificial Intelligence. Lecture Notes in Computer Science, vol 9810. Springer, Cham. https://doi.org/10.1007/978-3-319-42911-3_48
DOI: https://doi.org/10.1007/978-3-319-42911-3_48
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-42910-6
Online ISBN: 978-3-319-42911-3
eBook Packages: Computer Science (R0)