Emergent Robot Differentiation for Distributed Multi-Robot Task Allocation

  • Torbjørn S. Dahl
  • Maja J. Matarić
  • Gaurav S. Sukhatme
Conference paper


We present a distributed mechanism for automatically allocating tasks to robots in a manner sensitive to each robot’s performance level, without hand-coding these levels in advance. Such a mechanism is an important step toward improving multi-robot task allocation (MRTA) in systems where communication is restricted or where the complexity of the group dynamics makes it necessary to make allocation decisions locally. We demonstrate the general mechanism as an improvement on our previously published task allocation through vacancy chains (TAVC) algorithm for distributed MRTA. The TAVC algorithm uses individual reinforcement learning of task utilities and relies on the specializing abilities of the members of the group to produce dedicated optimal allocations. Through experiments with a realistic simulator, we evaluate the improved algorithm by comparing it to random allocation. We conclude that using softmax action selection functions on task utility values makes algorithms responsive to different performance levels in a group of heterogeneous robots.
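The idea of softmax action selection over task utility values can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the exact formulation, so the Boltzmann form, the `temperature` parameter, the incremental utility update, and all function and task names below are assumptions chosen for illustration.

```python
import math
import random


def softmax_probabilities(utilities, temperature=1.0):
    """Map task utility estimates to selection probabilities.

    Assumed Boltzmann form: P(task) is proportional to exp(utility / temperature).
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Subtract the largest utility before exponentiating for numerical stability;
    # this does not change the resulting probabilities.
    max_u = max(utilities.values())
    weights = {
        task: math.exp((u - max_u) / temperature)
        for task, u in utilities.items()
    }
    total = sum(weights.values())
    return {task: w / total for task, w in weights.items()}


def select_task(utilities, temperature=1.0, rng=random):
    """Sample one task according to its softmax probability."""
    probs = softmax_probabilities(utilities, temperature)
    tasks = list(probs)
    return rng.choices(tasks, weights=[probs[t] for t in tasks], k=1)[0]


def update_utility(old_estimate, reward, learning_rate=0.1):
    """Hypothetical incremental update of a task utility estimate from a reward."""
    return old_estimate + learning_rate * (reward - old_estimate)


if __name__ == "__main__":
    # Hypothetical utility estimates held locally by one robot.
    utilities = {"task_a": 1.0, "task_b": 2.0}
    print(softmax_probabilities(utilities, temperature=0.5))
    print(select_task(utilities, temperature=0.5, rng=random.Random(0)))
```

Under these assumptions, a robot that earns less reward on a task ends up with a lower utility estimate for it and therefore selects it less often, while still exploring it occasionally. A lower `temperature` makes selection greedier, and a higher one makes it closer to uniform random allocation.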


Keywords: Task Allocation, Reward Function, Average Reward, Traversal Time, Task Utility
These keywords were added by machine and not by the authors.





Copyright information

© Springer 2007

Authors and Affiliations

  • Torbjørn S. Dahl
    • 1
  • Maja J. Matarić
    • 2
  • Gaurav S. Sukhatme
    • 2
  1. Norwegian Defence Research Establishment (FFI), Kjeller, Norway
  2. Center for Robotics and Embedded Systems, University of Southern California, Los Angeles
