I have built a custom multi-agent environment with PettingZoo that sets up a turn-based game between two agents, A and B.
I want to examine situations where malicious behavior may arise given the game rules, and I am currently looking into training approaches.
To do that, I have implemented a deterministic policy as a baseline / control.
Fixing agent A to that baseline policy, I then want to train agent B and observe the resulting behaviors.
Once B settles on a desirable behavioral pattern, I want to train agent A (with B now fixed) to see how it responds to B's actions.
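To make the question concrete, here is a minimal, self-contained sketch of what I mean by "encapsulating A as part of the environment". It assumes a PettingZoo-style AEC interface (`reset` / `agent_iter` / `last` / `step`); the toy game, policy, and names below are illustrative stand-ins, not my actual environment:

```python
class ToyMatchingEnv:
    """Toy turn-based game standing in for the real env: A moves, then B moves;
    B earns +1 for matching A's preceding move. The API mimics PettingZoo's AEC loop."""
    N_ROUNDS = 4

    def reset(self):
        self._round = 0
        self._next = "agent_A"
        self._a_move = 0
        self._last_reward = {"agent_A": 0.0, "agent_B": 0.0}
        self._done = False

    def agent_iter(self):
        while not self._done:
            yield self._next
        self._next = "agent_B"      # final pass so B observes its last reward
        yield "agent_B"

    def last(self):
        # (observation, reward accrued since this agent last acted,
        #  termination, truncation, info) -- mirroring PettingZoo's AEC API
        return self._a_move, self._last_reward[self._next], self._done, False, {}

    def step(self, action):
        if action is None:          # post-termination convention
            return
        if self._next == "agent_A":
            self._a_move = action
            self._next = "agent_B"
        else:
            self._last_reward["agent_B"] = float(action == self._a_move)
            self._round += 1
            self._done = self._round >= self.N_ROUNDS
            self._next = "agent_A"


def fixed_policy_A(obs):
    # deterministic baseline (hypothetical: always plays move 1)
    return 1


def run_episode(env, act_B):
    """Roll out one episode with A frozen; from B's point of view,
    A is just part of the environment dynamics."""
    env.reset()
    total_B = 0.0
    for agent in env.agent_iter():
        obs, reward, termination, truncation, info = env.last()
        if agent == "agent_B":
            total_B += reward       # in a real setup: store B's transition here
        if termination or truncation:
            action = None
        elif agent == "agent_A":
            action = fixed_policy_A(obs)   # frozen baseline, not trained
        else:
            action = act_B(obs)            # the learner's policy
        env.step(action)
    return total_B
```

In this sketch only B's transitions would ever reach a learner; `fixed_policy_A` is called inside the rollout exactly like any other piece of environment dynamics. My question below is about whether this framing stays valid once I later unfreeze A.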
With the above setting in mind:
Is this training approach, which keeps one agent fixed while training the other, correct?
Should I instead follow a proper MARL approach for training, or is the above scheme, which encapsulates one agent as part of the environment, sound?
In general, what requirements or desiderata should I look for that hint that a MARL approach is the correct way, and/or that a separate alternating training scheme is erroneous?