
Subgoal Identification for Reinforcement Learning and Planning in Multiagent Problem Solving

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4687)

Abstract

We present a new probability flow analysis algorithm that automatically identifies subgoals in a problem space. The flow analysis, inspired by preflow-push algorithms, exploits the topological structure of the problem space to identify, in linear time, states that connect different subsets of the state space and treats them as subgoals. We then apply a hybrid subgoal-based SMDP (semi-Markov decision process) approach that combines reinforcement learning and planning over the identified subgoals to solve problems in a multiagent environment. The effectiveness of the method is demonstrated and evaluated in a capture-the-flag scenario. We also show that cooperative coordination emerges between the two agents in the scenario through distributed policy learning.
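To give a concrete feel for the idea, the sketch below illustrates subgoal identification on a toy two-room grid. It is a minimal path-counting proxy, not the authors' preflow-push-inspired algorithm (which runs in linear time); the helper names `shortest_path` and `identify_subgoals` and the grid layout are our own illustrative assumptions. States through which many shortest paths between sampled start/goal pairs are forced, here the doorway between the two rooms, are exactly the kind of states the paper treats as subgoals.

```python
from collections import deque, Counter
from itertools import product

def shortest_path(adj, src, dst):
    """BFS shortest path in an unweighted state-transition graph."""
    prev = {src: None}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        if u == dst:                      # reconstruct path back to src
            path = []
            while u is not None:
                path.append(u)
                u = prev[u]
            return path[::-1]
        for v in adj[u]:
            if v not in prev:
                prev[v] = u
                queue.append(v)
    return None                           # dst unreachable from src

def identify_subgoals(adj, pairs, top_k=3):
    """Rank states by how often they lie on shortest paths between
    sampled (start, goal) pairs; states bridging otherwise separate
    regions of the state space accumulate the most 'flow'.
    (Illustrative proxy, not the paper's flow analysis.)"""
    hits = Counter()
    for start, goal in pairs:
        path = shortest_path(adj, start, goal)
        if path:
            for state in path[1:-1]:      # interior states only
                hits[state] += 1
    return hits.most_common(top_k)

# Two 3x3 "rooms" joined by a single doorway at (1, 3): every
# room-to-room path is forced through the doorway corridor.
cells = {(r, c) for r, c in product(range(3), range(7))
         if c != 3 or r == 1}
adj = {(r, c): [n for n in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1))
                if n in cells]
       for (r, c) in cells}
pairs = [((r1, 0), (r2, 6)) for r1 in range(3) for r2 in range(3)]
print(identify_subgoals(adj, pairs))      # doorway corridor (1,2)-(1,3)-(1,4) tops the ranking
```

In the subgoal-based SMDP approach described in the abstract, such bottleneck states would then serve as the subgoals over which planning operates, with reinforcement learning handling behavior between them.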




Editor information

Paolo Petta, Jörg P. Müller, Matthias Klusch, Michael Georgeff


Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Chiu, CC., Soo, VW. (2007). Subgoal Identification for Reinforcement Learning and Planning in Multiagent Problem Solving. In: Petta, P., Müller, J.P., Klusch, M., Georgeff, M. (eds) Multiagent System Technologies. MATES 2007. Lecture Notes in Computer Science (LNAI), vol 4687. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74949-3_4


  • DOI: https://doi.org/10.1007/978-3-540-74949-3_4

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-74948-6

  • Online ISBN: 978-3-540-74949-3

  • eBook Packages: Computer Science (R0)
