
Pruning Dominated Policies in Multiobjective Pareto Q-Learning

  • Conference paper

Advances in Artificial Intelligence (CAEPIA 2018)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11160)

Abstract

The solution to a Multi-Objective Reinforcement Learning problem is a set of Pareto-optimal policies. MPQ-learning is a recent algorithm that approximates the set of all Pareto-optimal deterministic policies by directly generalizing Q-learning to the multiobjective setting. In this paper we present a modification of MPQ-learning that avoids useless cyclical policies and thus reduces the number of training steps required for convergence.
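The full paper is not reproduced here, so the following Python sketch is only an informal illustration of the Pareto-dominance test and the pruning of dominated vector-valued estimates that this family of multiobjective Q-learning methods relies on; it is not the authors' MPQ-learning algorithm, and the names `dominates` and `pareto_prune` are hypothetical.

```python
import numpy as np

def dominates(u, v):
    """Return True if reward vector u Pareto-dominates v:
    u is at least as good in every objective and strictly better in at least one."""
    u, v = np.asarray(u, dtype=float), np.asarray(v, dtype=float)
    return bool(np.all(u >= v) and np.any(u > v))

def pareto_prune(vectors):
    """Keep only the non-dominated vectors from an iterable of reward vectors."""
    vectors = [np.asarray(v, dtype=float) for v in vectors]
    frontier = []
    for i, v in enumerate(vectors):
        # discard v if some other candidate dominates it
        if not any(dominates(u, v) for j, u in enumerate(vectors) if j != i):
            # avoid keeping exact duplicates of a vector already on the frontier
            if not any(np.array_equal(v, w) for w in frontier):
                frontier.append(v)
    return frontier

# Example: three candidate vector-valued estimates for one (state, action) pair.
# (0.5, 2.5) is dominated by (1.0, 3.0) and is pruned; the other two are incomparable.
candidates = [(1.0, 3.0), (2.0, 2.0), (0.5, 2.5)]
print(pareto_prune(candidates))  # -> [array([1., 3.]), array([2., 2.])]
```

In this spirit, pruning dominated (or cyclically useless) candidates keeps the sets of vector estimates small, which is what reduces the training effort the abstract refers to.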

Supported by: the Spanish Government, Agencia Estatal de Investigación (AEI) and European Union, Fondo Europeo de Desarrollo Regional (FEDER), grant TIN2016-80774-R (AEI/FEDER, UE); and Plan Propio de Investigación de la Universidad de Málaga - Campus de Excelencia Internacional Andalucía Tech.

Author information

Corresponding author

Correspondence to Lawrence Mandow.

Copyright information

© 2018 Springer Nature Switzerland AG

About this paper

Cite this paper

Mandow, L., Pérez-de-la-Cruz, J.L. (2018). Pruning Dominated Policies in Multiobjective Pareto Q-Learning. In: Herrera, F., et al. Advances in Artificial Intelligence. CAEPIA 2018. Lecture Notes in Computer Science (LNAI), vol 11160. Springer, Cham. https://doi.org/10.1007/978-3-030-00374-6_23

  • DOI: https://doi.org/10.1007/978-3-030-00374-6_23

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-00373-9

  • Online ISBN: 978-3-030-00374-6

  • eBook Packages: Computer Science, Computer Science (R0)
