Abstract
This paper explores the possibility of combining an actor and a critic in one architecture and training them with a mixture of updates. It describes a model for robot navigation whose architecture resembles an actor-critic reinforcement learning architecture: the actor forms one layer, followed by a second layer that deduces the value function, so a critic-like outcome is combined with the actor in a single network. The model can therefore serve as the basis for a truly deep reinforcement learning architecture to be explored in future work. More importantly, this work examines the effect of mixing a conjugate gradient update with a plain gradient update in this architecture. The reward signal is back-propagated from the critic to the actor through a conjugate gradient eligibility trace for the second layer combined with a gradient eligibility trace for the first layer. We show that this mixture of updates works well for this model. The feature layer is trained by applying a simple PCA to the full set of image histograms acquired during the first running episode, and the model is also able to adapt autonomously to a reduced feature dimension. Initial experimental results on a real robot show that the agent achieves a good success rate in reaching a goal location.
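The mixture of updates described in the abstract can be sketched roughly as follows. This is a minimal illustrative sketch, not the paper's actual algorithm: the layer sizes, the tanh actor activations, the Polak-Ribière-style coefficient for the conjugate direction, and all constants are assumptions introduced for illustration. The TD error drives a conjugate-gradient-style eligibility trace on the second (value) layer and a conventional decaying gradient trace on the first (actor) layer.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions and constants (not taken from the paper)
n_features, n_hidden = 16, 4          # e.g. PCA-reduced histogram features
gamma, lam, alpha = 0.95, 0.9, 0.01   # discount, trace decay, step size

W1 = rng.normal(scale=0.1, size=(n_hidden, n_features))  # actor layer
w2 = rng.normal(scale=0.1, size=n_hidden)                # value (critic) layer

e1 = np.zeros_like(W1)               # plain gradient eligibility trace
e2 = np.zeros_like(w2)               # conjugate-gradient eligibility trace
g2_prev = np.full_like(w2, 1e-6)     # previous layer-2 gradient

def mixed_update(phi, reward, v_next):
    """One TD step: CG-style trace on layer 2, gradient trace on layer 1."""
    global W1, w2, e2, g2_prev
    a = np.tanh(W1 @ phi)                 # actor-layer activations
    v = float(w2 @ a)                     # value deduced by the second layer
    delta = reward + gamma * v_next - v   # TD error

    # Layer 2: a Polak-Ribiere-style coefficient turns the trace into a
    # conjugate direction of the value gradient (an assumed formulation).
    g2 = a                                # dV/dw2
    beta = max(0.0, float(g2 @ (g2 - g2_prev)) / float(g2_prev @ g2_prev))
    e2[:] = g2 + beta * e2
    g2_prev = g2.copy()

    # Layer 1: conventional accumulating gradient trace
    g1 = np.outer(w2 * (1.0 - a**2), phi)  # dV/dW1 via the chain rule
    e1[:] = gamma * lam * e1 + g1

    # Back-propagate the TD error through both traces
    w2 += alpha * delta * e2
    W1 += alpha * delta * e1
    return delta
```

The split mirrors the abstract's description: only the second layer's trace uses the conjugate direction, while the first layer keeps the standard gradient trace, so the two update rules coexist in one network.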
© 2015 Springer International Publishing Switzerland
Cite this paper
Altahhan, A. (2015). Deep Feature-Action Processing with Mixture of Updates. In: Arik, S., Huang, T., Lai, W., Liu, Q. (eds) Neural Information Processing. ICONIP 2015. Lecture Notes in Computer Science(), vol 9492. Springer, Cham. https://doi.org/10.1007/978-3-319-26561-2_1
Print ISBN: 978-3-319-26560-5
Online ISBN: 978-3-319-26561-2