
Exploring Multi-action Relationship in Reinforcement Learning

  • Conference paper
PRICAI 2016: Trends in Artificial Intelligence (PRICAI 2016)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9810)


Abstract

In many real-world reinforcement learning problems, an agent needs to control multiple actions simultaneously. In this setting, each action has commonly been treated independently of the others. However, these actions are rarely independent in applications, and learning can be accelerated if the underlying relationship among the actions is exploited. This paper explores the multi-action relationship in reinforcement learning. We propose to learn the multi-action relationship by enforcing a regularization term that captures it. We incorporate the regularization term into the least-squares policy-iteration and temporal-difference methods, which results in efficiently solvable convex learning objectives. The proposed methods are validated empirically in several domains. Experimental results show that incorporating the multi-action relationship can effectively improve learning performance.
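Only the abstract is shown on this page, so the paper's exact objective is not available here. The following is a minimal Python sketch of one plausible reading of the idea: each action dimension gets a linear value function over shared state features, and the per-action weight vectors are coupled through a learned relationship matrix Omega via the regularizer tr(W Omega^{-1} W^T), alternating a convex solve for the weights with a closed-form update of Omega, in the spirit of Zhang and Yeung's regularized multitask relationship learning. The function name, the objective, and the alternating scheme are all assumptions, not the paper's formulation.

    import numpy as np

    def fit_relationship_regularized(Phi, Y, lam=1.0, iters=10):
        # Hypothetical sketch, not the paper's method. Fit per-action
        # linear value weights W (one column per action dimension) to
        # regression targets Y, e.g. one-step TD targets, under the
        # assumed objective
        #     sum_j ||Phi w_j - y_j||^2 + lam * tr(W inv(Omega) W^T),
        # which is convex in W for a fixed relationship matrix Omega.
        n_feat, n_act = Phi.shape[1], Y.shape[1]
        Omega = np.eye(n_act) / n_act      # start from independent actions
        G = Phi.T @ Phi                    # (n_feat, n_feat) Gram matrix
        b = (Phi.T @ Y).T.reshape(-1)      # column-stacked Phi^T Y
        for _ in range(iters):
            # W-step: stationarity gives G W + lam * W inv(Omega) = Phi^T Y,
            # a Sylvester equation; solve its Kronecker form directly.
            A = (np.kron(np.eye(n_act), G)
                 + lam * np.kron(np.linalg.inv(Omega), np.eye(n_feat)))
            W = np.linalg.solve(A, b).reshape(n_act, n_feat).T
            # Omega-step: under a unit-trace constraint, the minimizer of
            # tr(W inv(Omega) W^T) is the normalized square root of W^T W.
            vals, vecs = np.linalg.eigh(W.T @ W)
            S = vecs @ np.diag(np.sqrt(np.maximum(vals, 1e-12))) @ vecs.T
            Omega = S / np.trace(S)
        return W, Omega

    # Tiny usage example: two action dimensions with correlated targets.
    rng = np.random.default_rng(0)
    Phi = rng.normal(size=(200, 5))
    w_true = rng.normal(size=(5, 1))
    Y = Phi @ np.hstack([w_true, 0.9 * w_true]) + 0.1 * rng.normal(size=(200, 2))
    W, Omega = fit_relationship_regularized(Phi, Y, lam=0.5)
    print(Omega)  # strong off-diagonal entries indicate related actions

For fixed Omega the weight solve is an ordinary regularized least-squares problem, which is consistent with the abstract's claim that the regularized objectives remain convex and efficiently solvable.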

Y. Yu: This research was supported by the NSFC (61375061, 61223003) and the Foundation for the Author of National Excellent Doctoral Dissertation of China (201451).



Author information

Correspondence to Yang Yu.



Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Wang, H., Yu, Y. (2016). Exploring Multi-action Relationship in Reinforcement Learning. In: Booth, R., Zhang, M.-L. (eds.) PRICAI 2016: Trends in Artificial Intelligence. Lecture Notes in Computer Science, vol. 9810. Springer, Cham. https://doi.org/10.1007/978-3-319-42911-3_48


  • DOI: https://doi.org/10.1007/978-3-319-42911-3_48

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-42910-6

  • Online ISBN: 978-3-319-42911-3

  • eBook Packages: Computer Science; Computer Science (R0)
