Questions tagged [delayed-rewards]
6 questions
9
votes
2 answers
Reinforcement Learning with asynchronous feedback
I am looking for literature on reinforcement learning algorithms that perform well with asynchronous feedback from the environment. By asynchronous feedback I mean that when an agent performs an action, it gets feedback (reward or regret)…

papabiceps
- 191
- 6
6
votes
1 answer
How to improve the reward signal when the rewards are sparse?
When the reward is delayed, a model's ability to do proper credit assignment can suffer. With a sparse reward, are there ways to mitigate this?
In a chess example, there are certain moves that you can…

tryingtolearn
- 385
- 1
- 2
- 10
4
votes
2 answers
How to deal with the time delay in reinforcement learning?
I have a question regarding the time delay in reinforcement learning (RL).
In RL, one has a state, a reward and an action. As far as I understand it, it is usually assumed that when the action is executed on the system, the state changes immediately…

jengmge
- 41
- 1
- 2
3
votes
1 answer
Can reinforcement learning be used for tasks where only one final reward is received?
Can the reinforcement learning problem be adapted to a setting where there is only a single, final reward? I am aware of problems with sparse and delayed rewards, but what about a single reward and a fairly long path?

TomR
- 823
- 5
- 15
0
votes
0 answers
Should I model this problem as a POMDP?
Suppose we have a finite-horizon sequential decision-making problem. At period $t$ we are in state $s$. We take action $a$ and we receive reward $r$ and go to state $s-1$ at period $t+1$. However, it is possible with a positive probability ($p>0$)…

Amin
- 471
- 2
- 11
0
votes
1 answer
How to deal with delay in reinforcement learning, an unclear case
According to the question "How to deal with the time delay in reinforcement learning?", a delay in reinforcement learning can be an observation delay, an action delay or a reward delay.
I have a special case of delay, but I am not…

CharlesC
- 1
- 1