Is there a multi-agent deep reinforcement learning algorithm which is for environments with only discrete action spaces (Not hybrid)?

Question

Is there a multi-agent deep reinforcement learning algorithm which is for environments with only discrete action spaces (Not hybrid) and have centralized training?

I have been looking for algorithms, (A2C, MADDPG etc.) but still havent find any algorithm that provides all of properties i mentioned (Multi agent + discrete action space + deep learning + centralized training).

I am wondering if we use an actor network that gets state as input and concatenated discrete actions of agents as output (For example if agent has 3 actions and we have 4 agents output can be [0,0,1, 0,1,0, 0,0,1, 1,0,0]) is that would be bad idea ?

I havn't done RL in a while, but isn't the point of multi agent the fact that they act on their own and thus are disctinct networks ? Why would you use multi agent if you merge them into one network agent ? I remember [Unity ML-Agents](https://github.com/Unity-Technologies/ml-agents/blob/main/docs/ML-Agents-Overview.md) plugin providing this kind of multi agent learning, maybe you could have a look at how they do it. — Ubikuity, May 17 '21 at 13:01
@Ubikuity if the agents are homogeneous, then you can have a common NN policy that is executed decentralized but have centralized training. Also, the last time I worked with Unity ML-Agents they didn't have a direct implementation of MARL algorithms. As I remember they have the usual RL PPO and SAC algorithms, although they can work in simple MARL implementations. — Felipe Costa, May 18 '21 at 16:45

score 1 · Answer 1 · answered May 18 '21 at 17:04

A natural policy to act in an environment with discrete action space would be a softmax.

This paper describes a method that uses the idea of centralized training, and I believe could be used in your implementation.

With regard to your last question, I don't know if i understood, but if you have a system that must perform 3 actions, you could assign each action to a specific agent (assuming we have three different action spaces). Then you would have a cooperation game with 3 agents, where all of them have a common reward function. In theory, this 3 agents represents an individual agent that interacts with the environment.

Is there a multi-agent deep reinforcement learning algorithm which is for environments with only discrete action spaces (Not hybrid)?

1 Answers1