nMetaQ: An n-agent reinforcement learning algorithm based on meta equilibrium

Yujing Hu, Zhaonan Sun, Xingguo Chen, Yang Gao, Ruili Wang

Research output: Contribution to conference, Paper, peer-review

Abstract

Multi-agent reinforcement learning (MARL) has been widely studied in recent years. One approach in MARL is to combine game theory with reinforcement learning (RL) to guide action selection and policy updates. This approach adopts Markov games as the framework, and policies are learnt based on equilibrium theories. Several algorithms have been proposed based on this idea, such as minimax-Q, NashQ, FFQ, Correlated-Q and MetaQ. However, some of these algorithms were proposed only for 2-agent problems, while the others have difficulty handling problems with more than 2 agents. Since many real-world tasks involve more than 2 agents, an algorithm that can deal with n-agent (n > 2) problems is needed. In this paper, we propose nMetaQ, which is based on MetaQ and can be applied to multi-agent environments with more than 2 agents. Experimental results demonstrate the empirical convergence of nMetaQ and show its satisfactory adaptive performance. The most important advantage of nMetaQ is that it can work efficiently and effectively in an n-agent (n > 2) environment, while previous methods may not.

Original language: English
Pages: 87-94
Number of pages: 8
Publication status: Published - 2012
Externally published: Yes
Event: 2012 Workshop on Adaptive and Learning Agents, ALA 2012 - Held in Conjunction with the 11th International Conference on Autonomous Agents and Multiagent Systems, AAMAS 2012 - Valencia, Spain
Duration: 4 Jun 2012 to 5 Jun 2012

Conference

Conference: 2012 Workshop on Adaptive and Learning Agents, ALA 2012 - Held in Conjunction with the 11th International Conference on Autonomous Agents and Multiagent Systems, AAMAS 2012
Country/Territory: Spain
City: Valencia
Period: 4/06/12 to 5/06/12

Keywords

  • Game theory
  • Metagame
  • Multi-agent reinforcement learning
  • nMetaQ

ASJC Scopus subject areas

  • Artificial Intelligence
  • Software
