I understand the goals and purposes of RL in the single-agent case, and the underlying model for such problems, i.e. MDPs (or, more generally, sequential decision making under uncertainty).
My question (and I know this may be subjective) is: what are the indicators for choosing to model a decision-making problem with a single agent, treating all other factors and noise as part of the environment (an MDP or some variant of it), rather than with multiple agents (a stochastic/Markov game)?
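To make the two options concrete, here is a rough sketch of how I picture the single-agent view arising from the multi-agent one. The notation is my own assumption rather than taken from a particular reference: $P$ and $r_i$ are the transition kernel and agent $i$'s reward in the Markov game, $a_{-i}$ is the joint action of everyone except agent $i$, and $\pi_{-i}$ is a fixed, stationary joint policy for those other agents. Under these assumptions, agent $i$ faces an ordinary MDP with

$$
\tilde{P}_i(s' \mid s, a_i) = \sum_{a_{-i}} \pi_{-i}(a_{-i} \mid s)\, P(s' \mid s, a_i, a_{-i}),
\qquad
\tilde{r}_i(s, a_i) = \sum_{a_{-i}} \pi_{-i}(a_{-i} \mid s)\, r_i(s, a_i, a_{-i}).
$$

As far as I can tell, this reduction is exact only while $\pi_{-i}$ stays fixed; if the other agents adapt or learn, $\tilde{P}_i$ and $\tilde{r}_i$ change over time and the environment looks non-stationary from agent $i$'s point of view.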
In the zero-sum/adversarial setting, where the agents' goals conflict, or the purely cooperative setting, where the agents assist each other, it is obvious that the multi-agent setting is the way to go. But suppose that there is neither pure conflict nor pure coordination.