Abstract
In this chapter, we take the policy-gradient-based REINFORCE-with-baseline algorithm further and combine it with the value-estimation ideas from DQN, bringing the best of both worlds together in the Actor-Critic algorithm. We then discuss the “advantage” baseline implementation of the model with deep-learning-based approximators, and extend the concept to a parallel implementation of the deep-learning-based advantage actor-critic algorithm in both the synchronous (A2C) and the asynchronous (A3C) modes.
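The core idea the abstract describes can be sketched in a few lines: a critic estimates state values, the TD error serves as the advantage baseline, and the actor takes a policy-gradient step scaled by that advantage. The following is a minimal tabular sketch under assumed toy settings (state/action counts, learning rates, and the single illustrative transition are all hypothetical, not the chapter's implementation, which uses deep-learning-based approximators):

```python
import numpy as np

# Illustrative sizes and hyperparameters (assumptions, not from the chapter).
n_states, n_actions = 4, 2
gamma = 0.99
lr_actor, lr_critic = 0.1, 0.1

# Actor: softmax policy over per-state logits. Critic: tabular state values.
logits = np.zeros((n_states, n_actions))
values = np.zeros(n_states)

def policy(s):
    """Softmax action probabilities for state s (numerically stabilized)."""
    z = logits[s] - logits[s].max()
    p = np.exp(z)
    return p / p.sum()

def actor_critic_update(s, a, r, s_next, done):
    # TD target bootstraps from the critic's estimate of the next state.
    target = r + (0.0 if done else gamma * values[s_next])
    advantage = target - values[s]  # TD error used as the advantage baseline

    # Critic step: move V(s) toward the TD target.
    values[s] += lr_critic * advantage

    # Actor step: policy-gradient ascent on log pi(a|s), scaled by advantage.
    p = policy(s)
    grad_log_pi = -p
    grad_log_pi[a] += 1.0
    logits[s] += lr_actor * advantage * grad_log_pi

# One illustrative transition: action 0 in state 0 earns reward 1.
actor_critic_update(s=0, a=0, r=1.0, s_next=1, done=False)
print(values[0], policy(0))
```

After this single positive-advantage update, the critic's value for state 0 rises and the actor shifts probability mass toward the rewarded action. A2C runs such updates synchronously across parallel workers, while A3C applies them asynchronously to a shared set of parameters.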
Copyright information
© 2019 Springer Nature Singapore Pte Ltd.
Cite this chapter
Sewak, M. (2019). Actor-Critic Models and the A3C. In: Deep Reinforcement Learning. Springer, Singapore. https://doi.org/10.1007/978-981-13-8285-7_11
Print ISBN: 978-981-13-8284-0
Online ISBN: 978-981-13-8285-7