Can anyone recommend a reinforcement learning algorithm for a multi-agent environment?
In my simplified example, I'm implementing a Q-Learning system with 10 different agents. The agents compete for resources in stores at different locations by setting a bid price for each item.
All of the agents place different bids but share a pooled budget of $100. Once the budget is spent, the agents cannot buy anything more that day.
Each agent receives a reward when it buys an item. The goal is to maximize the total number of items bought across all agents.
Right now the agents don't communicate.
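To make the setup concrete, here is a minimal sketch of independent learners under a pooled budget. It is a simplification under stated assumptions, not the actual implementation: the bid levels, the number of items per day, the highest-bid-wins rule, and the learning parameters are all made up for illustration, and the update is the stateless (single-state) form of Q-learning.

```python
import random

N_AGENTS = 10                       # from the description above
DAILY_BUDGET = 100.0                # pooled budget, from the description above
BID_LEVELS = [1.0, 2.0, 5.0, 10.0]  # assumed discrete bid prices
ITEMS_PER_DAY = 50                  # assumed number of items offered per day
ALPHA = 0.1                         # assumed learning rate
EPSILON = 0.1                       # assumed exploration rate


def choose_bid(q_values, rng):
    """Epsilon-greedy choice; returns an index into BID_LEVELS."""
    if rng.random() < EPSILON:
        return rng.randrange(len(BID_LEVELS))
    return max(range(len(BID_LEVELS)), key=lambda a: q_values[a])


def run_day(q_tables, rng):
    """Simulate one day and update every agent's Q-values in place.

    Returns (items bought per agent, budget left at the end of the day).
    """
    remaining = DAILY_BUDGET
    bought = [0] * N_AGENTS
    for _ in range(ITEMS_PER_DAY):
        actions = [choose_bid(q, rng) for q in q_tables]
        # Assumed auction rule: the highest bid wins the item;
        # ties go to the agent with the lowest index.
        winner = max(range(N_AGENTS), key=lambda i: BID_LEVELS[actions[i]])
        price = BID_LEVELS[actions[winner]]
        sold = price <= remaining  # no purchase once the pool cannot cover it
        if sold:
            remaining -= price
            bought[winner] += 1
        # Each agent learns only from its own reward (no communication).
        for i, (q, a) in enumerate(zip(q_tables, actions)):
            reward = 1.0 if (sold and i == winner) else 0.0
            q[a] += ALPHA * (reward - q[a])
    return bought, remaining


rng = random.Random(0)
q_tables = [[0.0] * len(BID_LEVELS) for _ in range(N_AGENTS)]
for day in range(200):
    bought, remaining = run_day(q_tables, rng)
print("items bought on the last day:", sum(bought), "budget left:", remaining)
```

Because each agent is rewarded only for its own purchases, nothing in this sketch discourages agents from outbidding one another and spending the shared budget on fewer items.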
Can someone point me in the right direction for an algorithm that allows agent cooperation?