Questions tagged [multi-objective-rl]
For questions about multi-objective reinforcement learning (MORL).
5 questions
12 votes, 3 answers
Why is the reward in reinforcement learning always a scalar?
I'm reading Reinforcement Learning by Sutton & Barto, and in section 3.2 they state that the reward in a Markov decision process is always a scalar real number. At the same time, I've heard about the problem of assigning credit to an action for a…

Sid Mani · 223
3 votes, 2 answers
Can rewards be decomposed into components?
I'm training a robot to walk to a specific $(x, y)$ point using TD3, and, for simplicity, I have something like reward = distance_x + distance_y + standing_up_straight, which is then added to the replay buffer. However, I think that it…

pinkie pAI · 35
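A decomposed reward like the one in the question above can also be kept as a vector and collapsed to a scalar only at the point where the learner needs it. A minimal sketch of linear scalarization, reusing the question's component names (the values and weights are purely illustrative):

```python
import numpy as np

def vector_reward(distance_x, distance_y, standing_up_straight):
    """Vector-valued reward: one component per objective
    (component names follow the question; values are illustrative)."""
    return np.array([distance_x, distance_y, standing_up_straight])

def scalarize(reward_vec, weights):
    """Linear scalarization: weighted sum of the reward components."""
    return float(np.dot(weights, reward_vec))

r = vector_reward(-0.5, -0.2, 1.0)
# Equal weights reproduce the plain sum used in the question.
scalar_r = scalarize(r, np.array([1.0, 1.0, 1.0]))
print(scalar_r)
```

Storing the vector in the replay buffer and scalarizing at training time keeps the per-objective information available, e.g. for changing the weights later without recollecting data.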
2 votes, 1 answer
Can the rewards be matrices when using DQN?
I have a basic question. I'm working towards developing a reward function for my DQN. I'd like to train an RL agent to edit pixels on an image. I understand that convolutions are ideal for working with images, but I'd like to observe the agent doing…

junfanbl · 323
2 votes, 1 answer
What are preferences and preference functions in multi-objective reinforcement learning?
In RL (reinforcement learning) or MARL (multi-agent reinforcement learning), we have the usual tuple:
(state, action, transition_probabilities, reward, next_state)
In MORL (multi-objective reinforcement learning), we have two more additions to the…

Huan · 161
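A preference function in MORL is commonly taken to be a mapping from a vector of objective returns to a scalar utility. A minimal sketch assuming linear preferences — a non-negative weight vector summing to 1 — with hypothetical objective values:

```python
import numpy as np

def linear_preference(returns, w):
    """Scalar utility under linear preferences: u(R) = w . R,
    where w is a non-negative weight vector summing to 1."""
    w = np.asarray(w, dtype=float)
    assert np.all(w >= 0) and abs(w.sum() - 1.0) < 1e-9
    return float(np.dot(w, returns))

# Two objectives, e.g. speed vs. energy cost; values illustrative.
R = np.array([3.0, -1.0])
u = linear_preference(R, [0.75, 0.25])
print(u)  # 0.75 * 3.0 + 0.25 * (-1.0) = 2.0
```

Linear preferences are only one choice; nonlinear utilities (e.g. thresholds or Chebyshev scalarization) can express trade-offs that no fixed weighted sum captures.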
2 votes, 1 answer
What are some simple open problems in multi-agent RL that would be suited for a bachelor's thesis?
I've decided to do my bachelor's thesis on RL, and I'm currently struggling to find a good problem. I'm interested in multi-agent RL, particularly the dilemma between selfishness and cooperation.
I only have 2 months to complete this and I'm afraid that…

Rom · 139