Multi-Robot Learning in a Cooperative Observation Task
An important need in multi-robot systems is the development of mechanisms that enable robot teams to generate cooperative behaviors autonomously. This paper first briefly presents the Cooperative Multi-robot Observation of Multiple Moving Targets (CMOMMT) application as a rich domain for studying the issues of multi-robot learning of new behaviors. We discuss the results of our hand-generated algorithm for CMOMMT, then describe our research on multi-robot learning techniques for the CMOMMT application, comparing the results to the hand-generated solution. Our results show that, while the learning approach performs better than random, naive approaches, considerable room remains before it matches the performance of the hand-generated approach. The ultimate goal of this research is to develop techniques for multi-robot learning and adaptation that generalize to cooperative robot applications in many domains, thus facilitating the practical use of multi-robot teams in a wide variety of real-world applications.
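In CMOMMT, a team of robots attempts to maximize the fraction of time that moving targets are kept within sensing range of at least one team member. As a rough illustration of how such a performance measure can be scored, the sketch below computes an observation metric of this kind for recorded robot and target trajectories; the function names, the planar-distance sensor model, and the averaging scheme are illustrative assumptions, not the paper's exact formulation.

```python
import math


def observed(robots, targets, sensor_range):
    """Count how many targets lie within sensor_range of at least
    one robot at a single time step (positions are (x, y) tuples)."""
    count = 0
    for tx, ty in targets:
        if any(math.hypot(tx - rx, ty - ry) <= sensor_range
               for rx, ry in robots):
            count += 1
    return count


def cmommt_metric(robot_traj, target_traj, sensor_range):
    """Average fraction of targets under observation, taken over all
    time steps.  robot_traj[t] and target_traj[t] hold the robot and
    target positions at step t; a score of 1.0 means every target was
    observed at every step."""
    steps = len(robot_traj)
    total = sum(observed(robot_traj[t], target_traj[t], sensor_range)
                for t in range(steps))
    return total / (steps * len(target_traj[0]))
```

A hand-generated or learned control policy would then be evaluated by running it in simulation and comparing the resulting scores under identical target motion.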
Keywords: Partially Observable Markov Decision Process · Cooperative Task · Robot Team · Neighboring Robot · Cooperative Robot