Efficient Reward Functions for Adaptive Multi-rover Systems

Tumer, Kagan; Agogino, Adrian

doi:10.1007/11691839_11

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3898)

Included in the following conference series:

International Workshop on Learning and Adaption in Multi-Agent Systems

Abstract

This chapter focuses on deriving reward functions that allow multiple agents to co-evolve efficient control policies that maximize a system-level reward in noisy and dynamic environments. The solution we present is based on agent rewards satisfying two crucial properties. First, the agent reward function and the global reward function have to be aligned, that is, an agent maximizing its agent-specific reward should also maximize the global reward. Second, the agent has to receive sufficient “signal” from its reward, that is, an agent’s action should have a large influence over its agent-specific reward. Agents using rewards with these two properties will evolve the correct policies quickly. This hypothesis is tested in episodic and non-episodic, continuous-space multi-rover environments in which rovers evolve to maximize a global reward function over all rovers. The environments are dynamic (i.e., they change over time) and noisy, and they restrict communication between agents. We show that a control policy evolved using agent-specific rewards satisfying the above properties outperforms policies evolved using global rewards by up to 400%. More notably, with a larger number of rovers, or with rovers whose sensors are noisy and communication-limited, the proposed method outperforms the global reward by a higher percentage than it does in noise-free conditions with a small number of rovers.
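The two properties above can be illustrated with a small sketch. The abstract does not give the form of the agent-specific reward, so everything below is an assumption made for illustration: a toy global reward over hypothetical points of interest (`pois`), and a difference-style agent reward (the global reward minus the global reward with that rover removed), which is one common way to obtain an aligned reward. The function names and numbers are invented for the example and are not taken from the chapter.

```python
import math

def global_reward(rovers, pois, min_dist_sq=1.0):
    """Toy system-level reward (an assumption, not the chapter's definition):
    each point of interest contributes its value divided by the squared
    distance to the closest rover, clipped from below by min_dist_sq."""
    if not rovers:
        return 0.0
    total = 0.0
    for px, py, value in pois:
        closest = min((rx - px) ** 2 + (ry - py) ** 2 for rx, ry in rovers)
        total += value / max(closest, min_dist_sq)
    return total

def difference_reward(i, rovers, pois):
    """Agent-specific reward for rover i: the global reward with rover i
    present minus the global reward with rover i removed."""
    without_i = rovers[:i] + rovers[i + 1:]
    return global_reward(rovers, pois) - global_reward(without_i, pois)

# Hypothetical scenario: two points of interest (x, y, value), three rovers (x, y).
pois = [(0.0, 0.0, 10.0), (10.0, 0.0, 5.0)]
rovers = [(3.0, 0.0), (4.0, 3.0), (9.0, 1.0)]
moved = [(3.0, 0.0), (1.0, 1.0), (9.0, 1.0)]  # only rover 1 changes position

# Change in the global reward and in rover 1's own reward for the same move.
delta_g = global_reward(moved, pois) - global_reward(rovers, pois)
delta_d = difference_reward(1, moved, pois) - difference_reward(1, rovers, pois)

# Alignment check: the two changes agree, because the subtracted term
# does not depend on rover 1's position.
print(math.isclose(delta_g, delta_d))
```

In this sketch the subtracted term is unaffected by rover 1's own move, so any move that raises rover 1's reward raises the global reward by the same amount (the first property). Subtracting that term also removes the part of the global reward that rover 1 cannot influence, which is one plausible way to strengthen the "signal" described in the second property; whether this matches the chapter's construction cannot be confirmed from the abstract alone.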

Author information

Authors and Affiliations

NASA Ames Research Center, Mail Stop 269-4, Moffett Field, CA, 94035, USA
Kagan Tumer
UC Santa Cruz, NASA Ames Research Center, Mail Stop 269-3, Moffett Field, CA, 94035, USA
Adrian Agogino

Editor information

Editors and Affiliations

MICC-IKAT, Universiteit Maastricht, The Netherlands
Karl Tuyls
Center for Mathematics and Computer Science (CWI), Kruislaan 413, P.O. Box 94079, 1090 GB Amsterdam, The Netherlands
Pieter Jan ’t Hoen
KaHo Sint-Lieven, Information Technology Group, Gebr. Desmetstraat 1, 9000, Gent, Belgium
Katja Verbeeck
Department of Mathematical and Computer Science, University of Tulsa, USA
Sandip Sen

About this paper

Cite this paper

Tumer, K., Agogino, A. (2006). Efficient Reward Functions for Adaptive Multi-rover Systems. In: Tuyls, K., Hoen, P.J., Verbeeck, K., Sen, S. (eds) Learning and Adaption in Multi-Agent Systems. LAMAS 2005. Lecture Notes in Computer Science, vol 3898. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11691839_11

DOI: https://doi.org/10.1007/11691839_11
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-33053-0
Online ISBN: 978-3-540-33059-2
eBook Packages: Computer Science, Computer Science (R0)
