Multi-Robot Learning in a Cooperative Observation Task

  • Lynne E. Parker
  • Claude Touzet


An important need in multi-robot systems is the development of mechanisms that enable robot teams to autonomously generate cooperative behaviors. This paper first briefly presents the Cooperative Multi-robot Observation of Multiple Moving Targets (CMOMMT) application as a rich domain for studying the issues of multi-robot learning of new behaviors. We discuss the results of our hand-generated algorithm for CMOMMT, then describe our research on multi-robot learning techniques for the CMOMMT application, comparing the results to the hand-generated solution. Our results show that, while the learning approach performs better than random, naive approaches, considerable room remains before it matches the results of the hand-generated approach. The ultimate goal of this research is to develop techniques for multi-robot learning and adaptation that generalize to cooperative robot applications in many domains, thus facilitating the practical use of multi-robot teams in a wide variety of real-world applications.
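The CMOMMT objective described above — keeping as many moving targets as possible under team observation — can be sketched as a simple simulation. The arena size, sensor range, and random-walk target motion below are illustrative assumptions for this sketch, not the paper's actual experimental parameters; the score is the time-averaged fraction of targets within sensor range of at least one robot.

```python
import math
import random

SENSOR_RANGE = 2.0   # assumed observation radius (illustrative)
ARENA = 10.0         # assumed side length of a square arena

def observed_fraction(robots, targets, sensor_range=SENSOR_RANGE):
    """Fraction of targets within sensor range of at least one robot."""
    def seen(target):
        return any(math.dist(robot, target) <= sensor_range for robot in robots)
    return sum(seen(t) for t in targets) / len(targets)

def cmommt_score(policy, steps=100, n_robots=3, n_targets=5, seed=0):
    """Time-averaged observed fraction over one run (the CMOMMT-style metric)."""
    rng = random.Random(seed)
    robots = [(rng.uniform(0, ARENA), rng.uniform(0, ARENA)) for _ in range(n_robots)]
    targets = [(rng.uniform(0, ARENA), rng.uniform(0, ARENA)) for _ in range(n_targets)]
    total = 0.0
    for _ in range(steps):
        robots = policy(robots, targets)
        # Targets drift by a small random step, clamped to the arena.
        targets = [(min(ARENA, max(0.0, x + rng.uniform(-0.5, 0.5))),
                    min(ARENA, max(0.0, y + rng.uniform(-0.5, 0.5))))
                   for x, y in targets]
        total += observed_fraction(robots, targets)
    return total / steps

def random_policy(robots, targets):
    """Naive baseline: each robot takes a small random step, ignoring targets."""
    return [(x + random.uniform(-0.5, 0.5), y + random.uniform(-0.5, 0.5))
            for x, y in robots]
```

A learned or hand-generated policy would replace `random_policy` with one that uses target positions; comparing their `cmommt_score` values mirrors the paper's comparison of naive, learned, and hand-generated approaches.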


Keywords: Partially Observable Markov Decision Process · Cooperative Task · Robot Team · Neighboring Robot · Cooperative Robot

These keywords were added by machine and not by the authors; the process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Tokyo 2000

Authors and Affiliations

  • Lynne E. Parker (1)
  • Claude Touzet (1)

  1. Center for Engineering Science Advanced Research, Computer Science and Mathematics Division, Oak Ridge National Laboratory, Oak Ridge, USA
