Papers in Multi-Agent Reinforcement Learning (MARL)
This is my list of papers on Multi-Agent Reinforcement Learning.
What makes this list outstanding?
Each paper has an introduction (a short comment) based on my understanding of it. If you spot any factual mistakes, I would be very grateful if you told me!
Each paper has a score to help you quickly find the ones that may enlighten you and accelerate your learning.
PS:
- "Score" ranges from 1 to 5. The higher the score, the more useful the paper (i.e. 5 means the highest quality and the most useful to study).
- Note that the scores reflect only my personal view.
Books and Reviews
Title | Introduction | Score |
---|---|---|
Reinforcement Learning: State-of-the-Art | A comprehensive review that covers POMDPs and Bayesian RL | 5 |
POMDP solution methods | A concise and detailed introduction to POMDPs | 4 |
A Concise Introduction to Decentralized POMDPs | A beginner-friendly and comprehensive book on Dec-POMDPs | 4 |
A Comprehensive Survey of Multi-agent Reinforcement Learning | A top-level overview of MARL, inclusive and comprehensive! | 5 |
Markov Decision Processes in Artificial Intelligence and CS294-Sequential Decisions: Planning and Reinforcement Learning | Detailed coverage of MDPs and topics beyond MDPs | 4 |
Multiagent Systems: Algorithmic, Game-Theoretic, and Logical Foundations | Written from the perspective of game theory, not deep reinforcement learning | 3 |
Deep Dec-POMDPs
Title | Introduction | Score |
---|---|---|
Multiagent Cooperation and Competition with Deep Reinforcement Learning | The first paper to look at MADRL after DQN? | 3 |
Deep Recurrent Q-Learning for Partially Observable MDPs | Addresses a problem with DQN: observation != state | 4 |
Cooperative Multi-Agent Control Using Deep Reinforcement Learning | Three schemes that extend DQN, DDPG, and TRPO from single-agent to multi-agent; code available | 4 |
Value-Decomposition Networks for Cooperative Multi-Agent Learning | The first paper to apply value decomposition in MADRL | 4 |
QMIX: Monotonic Value Function Factorisation for Deep Multi-agent Reinforcement Learning | Builds on VDN; more flexible decomposition of the global Q | 4 |
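
To make the last two rows concrete, here is a minimal sketch (my own toy example, not code from either paper) of the idea behind value decomposition. VDN combines per-agent Q-values by a plain sum, while QMIX allows any mixing that is monotonic in each agent's Q-value. In the actual QMIX paper the mixing weights come from hypernetworks conditioned on the global state and the mixer is non-linear; the single linear layer, the numbers in `agent_q`, and the names `vdn_q_tot` and `monotonic_q_tot` below are all assumptions made for illustration.

```python
from itertools import product

# Hypothetical per-agent Q-values: two agents, three actions each.
agent_q = [
    [1.0, 3.0, 2.0],  # agent 0
    [0.5, 0.2, 4.0],  # agent 1
]


def vdn_q_tot(chosen_qs):
    """VDN-style mixing: the joint value is the plain sum of per-agent values."""
    return sum(chosen_qs)


def monotonic_q_tot(chosen_qs, raw_weights, bias):
    """QMIX-style mixing, simplified to one linear layer.

    Taking abs() of the weights keeps them non-negative, so the joint value
    never decreases when any single agent's Q-value increases.
    """
    return sum(abs(w) * q for w, q in zip(raw_weights, chosen_qs)) + bias


def greedy_joint_action(mixer):
    """Brute-force argmax of the mixed value over all joint actions."""
    joint_actions = product(*(range(len(qs)) for qs in agent_q))
    return max(
        joint_actions,
        key=lambda joint: mixer([agent_q[i][a] for i, a in enumerate(joint)]),
    )


# Each agent greedily picks its own best action, without coordination.
decentralised = tuple(max(range(len(qs)), key=qs.__getitem__) for qs in agent_q)

vdn_best = greedy_joint_action(vdn_q_tot)
qmix_best = greedy_joint_action(
    lambda qs: monotonic_q_tot(qs, raw_weights=[-0.7, 2.5], bias=0.3)
)

print(decentralised, vdn_best, qmix_best)
```

In both cases the joint greedy action matches what each agent would pick on its own, which is the property that lets training be centralised while execution stays decentralised.
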
Opponent Modeling
Title | Introduction | Score |
---|---|---|
Modeling Others using Oneself in Multi-agent Reinforcement Learning | Uses the opponent's goal as an additional input | 3 |
Learning Policy Representations in Multi-agent Systems | Uses policy representations for clustering, classification, and RL (with the opponent's embedding as an additional input) | 4 |
Communication
Title | Introduction | Score |
---|---|---|
Emergence of Grounded Compositional Language in Multi-Agent Populations | | |
Learning to Communicate with Deep Multi-Agent Reinforcement Learning | Agents communicate discrete actions | 4 |
Learning Multiagent Communication with Backpropagation | Agents communicate hidden states | 3 |