Abstract
This chapter focuses on deriving reward functions that allow multiple agents to co-evolve efficient control policies that maximize a system-level reward in noisy and dynamic environments. The solution we present is based on agent rewards satisfying two crucial properties. First, the agent reward function and the global reward function have to be aligned; that is, an agent maximizing its agent-specific reward should also maximize the global reward. Second, the agent has to receive sufficient “signal” from its reward; that is, an agent’s action should have a large influence over its agent-specific reward. Agents using rewards with these two properties will evolve the correct policies quickly. This hypothesis is tested in episodic and non-episodic, continuous-space multi-rover environments where rovers evolve to maximize a global reward function over all rovers. The environments are dynamic (i.e., they change over time), noisy, and have restrictions on communication between agents. We show that a control policy evolved using agent-specific rewards satisfying the above properties outperforms policies evolved using global rewards by up to 400%. More notably, in the presence of a larger number of rovers, or of rovers with noisy and communication-limited sensors, the proposed method outperforms the global reward by a higher percentage than in noise-free conditions with a small number of rovers.
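The two properties can be illustrated with a minimal sketch. The abstract does not state which reward form the chapter uses, so the example below assumes a "difference" reward (the global reward minus the global reward with one rover removed), a form used in the collectives literature. The toy problem, `POI_VALUES`, and all function names are hypothetical and chosen only for illustration.

```python
from typing import List, Optional

# Hypothetical toy problem: each rover picks one point of interest (POI).
POI_VALUES = [10.0, 6.0, 3.0]

Choices = List[Optional[int]]  # one POI index per rover, None = rover removed


def global_reward(choices: Choices) -> float:
    """System-level reward: each POI's value counts once, however many rovers visit it."""
    visited = {c for c in choices if c is not None}
    return sum(POI_VALUES[p] for p in visited)


def difference_reward(choices: Choices, i: int) -> float:
    """Assumed agent-specific reward: global reward minus global reward without rover i."""
    without_i = list(choices)
    without_i[i] = None
    return global_reward(choices) - global_reward(without_i)


def is_aligned(choices: Choices, i: int) -> bool:
    """Check that every change of rover i's choice moves both rewards the same way."""
    base_g = global_reward(choices)
    base_d = difference_reward(choices, i)
    for alt in range(len(POI_VALUES)):
        trial = list(choices)
        trial[i] = alt
        dg = global_reward(trial) - base_g
        dd = difference_reward(trial, i) - base_d
        # Opposite signs, or one changing while the other does not, breaks alignment.
        if dg * dd < 0 or (dg == 0) != (dd == 0):
            return False
    return True


choices = [0, 0, 1]  # rovers 0 and 1 both visit the most valuable POI
print(global_reward(choices))
print([difference_reward(choices, i) for i in range(len(choices))])
print(all(is_aligned(choices, i) for i in range(len(choices))))
```

In this sketch the term subtracted in `difference_reward` does not depend on rover `i`'s own choice, so any change that raises the rover's reward raises the global reward by the same amount (alignment). The reward also isolates the rover's own contribution: a rover duplicating another rover's visit receives zero, which is a clearer signal than the shared global value.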
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Tumer, K., Agogino, A. (2006). Efficient Reward Functions for Adaptive Multi-rover Systems. In: Tuyls, K., Hoen, P.J., Verbeeck, K., Sen, S. (eds) Learning and Adaption in Multi-Agent Systems. LAMAS 2005. Lecture Notes in Computer Science, vol 3898. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11691839_11
DOI: https://doi.org/10.1007/11691839_11
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-33053-0
Online ISBN: 978-3-540-33059-2
eBook Packages: Computer Science, Computer Science (R0)