Questions tagged [sparse-rewards]

For questions about sparse rewards (or sparse reward functions), where the agent receives informative feedback only rarely, which can slow down learning. Reward shaping is commonly used to mitigate this problem.
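As a minimal illustration of the reward shaping mentioned above, a potential-based shaping term can be added to a sparse reward without changing the optimal policy (Ng, Harada & Russell, 1999). This is a sketch, not a prescribed implementation; the `potential` function is any heuristic you supply:

```python
def shaped_reward(reward, state, next_state, potential, gamma=0.99):
    """Potential-based reward shaping: r' = r + gamma * phi(s') - phi(s).

    Preserves the optimal policy while providing a dense learning
    signal. `potential` is any heuristic mapping states to scalars,
    e.g. negative distance to the goal.
    """
    return reward + gamma * potential(next_state) - potential(state)
```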

10 questions
6
votes
1 answer

How to improve the reward signal when the rewards are sparse?

In cases where the reward is delayed, this can negatively impact a model's ability to do proper credit assignment. In the case of a sparse reward, are there ways in which this can be mitigated? In a chess example, there are certain moves that you can…
6
votes
1 answer

What are the pros and cons of sparse and dense rewards in reinforcement learning?

From what I understand, if the rewards are sparse the agent will have to explore more to get rewards and learn the optimal policy, whereas if the rewards are dense in time, the agent is quickly guided towards its learning goal. Are the above…
4
votes
2 answers

How to apply Q-learning when the reward is only available at the last state?

I have a scheduling problem in which there are $n$ slots and $m$ clients. I am trying to solve the problem using Q-learning so I have made the following state-action model. A state $s_t$ is given by the current slot $t=1,2,\ldots,n$ and an action…
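For context on this kind of question: standard tabular Q-learning needs no modification for a terminal-only reward, because the bootstrapped targets propagate the final reward backward over repeated episodes. A minimal sketch, assuming a Gym-style episodic environment with `reset()`/`step()` (the names are illustrative, not from the question):

```python
import numpy as np

def q_learning(env, n_states, n_actions, episodes=5000,
               alpha=0.1, gamma=0.99, epsilon=0.1):
    """Tabular Q-learning; works even when the only nonzero reward
    arrives at the terminal state, since Q-values bootstrap backward."""
    Q = np.zeros((n_states, n_actions))
    for _ in range(episodes):
        s = env.reset()
        done = False
        while not done:
            # Epsilon-greedy exploration.
            a = (env.action_space.sample() if np.random.rand() < epsilon
                 else int(np.argmax(Q[s])))
            s_next, r, done, _ = env.step(a)  # r == 0 until the last slot
            target = r if done else r + gamma * np.max(Q[s_next])
            Q[s, a] += alpha * (target - Q[s, a])
            s = s_next
    return Q
```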
4
votes
1 answer

How does the optimization process in hindsight experience replay exactly work?

I was reading the following research paper Hindsight Experience Replay. This is the paper that introduces a concept called Hindsight Experience Replay (HER), which basically attempts to alleviate the infamous sparse reward problem. It is based on…
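The core mechanism the paper introduces is goal relabeling: failed transitions are stored a second time with an achieved state substituted for the intended goal, so they carry a reward signal after all. A minimal sketch of the "final" relabeling strategy, assuming goal-conditioned transitions stored as dicts and a hypothetical `reward_fn(achieved, goal)`; this is illustrative, not the paper's code:

```python
def her_relabel(episode, reward_fn):
    """Hindsight Experience Replay, 'final' strategy: replay each
    transition once with the original goal and once with the goal
    replaced by the state actually achieved at the end of the episode."""
    relabeled = []
    achieved_final = episode[-1]["achieved_goal"]
    for t in episode:
        # Original transition (sparse reward w.r.t. the intended goal).
        relabeled.append(t)
        # Hindsight transition: pretend the final achieved state was
        # the goal all along, turning the trajectory into a success.
        new_t = dict(t)
        new_t["goal"] = achieved_final
        new_t["reward"] = reward_fn(t["achieved_goal"], achieved_final)
        relabeled.append(new_t)
    return relabeled
```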
3
votes
1 answer

Are there any reliable ways of modifying the reward function to make the rewards less sparse?

If I am training an agent to try and navigate a maze as fast as possible, a simple reward would be something like \begin{align} R(\text{terminal}) &= N - \text{time}\ \ , \ \ N \gg \text{everything} \\ R(\text{state})& = 0\ \ \text{if not…
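One reliable modification for this maze setting is the potential-based shaping sketched under the tag description, with a maze-specific potential such as negative Manhattan distance to the goal. A sketch under that assumption (the grid-coordinate state representation and `goal` argument are illustrative):

```python
def manhattan_potential(state, goal):
    """Heuristic potential for a grid maze: closer to the goal => higher."""
    (x, y), (gx, gy) = state, goal
    return -(abs(x - gx) + abs(y - gy))

def dense_maze_reward(reward, state, next_state, goal, gamma=0.99):
    # Sparse terminal reward plus a shaping term rewarding progress
    # toward the goal; optimal policies are unchanged (Ng et al., 1999).
    return (reward
            + gamma * manhattan_potential(next_state, goal)
            - manhattan_potential(state, goal))
```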
3
votes
1 answer

Can reinforcement learning be used for tasks where only one final reward is received?

Is the reinforcement learning problem adaptable to a setting where there is only one, final, reward? I am aware of problems with sparse and delayed rewards, but what about a single reward at the end of a quite long path?
1
vote
1 answer

How do I compute the value function when the reward is only at the end in the context of actor-critic algorithms?

Consider the actor-critic reinforcement learning setting (actor and critic parameterized by neural networks). The reward is given only at the end of the episode (or, in the case of a timeout, not at all). How could we learn the value function?…
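One standard answer is to regress the critic on full Monte Carlo returns, which need no intermediate rewards: with a single terminal reward $R$, the target for the state at time $t$ is simply $\gamma^{T-t} R$. A minimal sketch (illustrative, not from the question):

```python
import numpy as np

def mc_value_targets(rewards, gamma=0.99):
    """Discounted returns G_t for critic regression. With a single
    terminal reward R (rewards = [0, 0, ..., R]) this reduces to
    G_t = gamma**(T - t) * R, a perfectly valid regression target."""
    G, targets = 0.0, []
    for r in reversed(rewards):
        G = r + gamma * G
        targets.append(G)
    return np.array(targets[::-1])

# e.g. a 4-step episode with reward only at the end:
# mc_value_targets([0, 0, 0, 1.0], gamma=0.9) -> [0.729, 0.81, 0.9, 1.0]
```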
0
votes
0 answers

Looking for a reinforcement learning algorithm that deals well with a model-based, curiosity-driven approach for chess AI

I am a software engineer who dabbled in machine learning (classifiers) during my thesis. After being out of it for a while, I decided I want to do a neural network project to learn from, specifically reinforcement learning. We'll see how…
0
votes
0 answers

How does Proximal Policy Optimization deal with sparse rewards?

In the original paper, the objective of PPO is the clipped surrogate $L^{CLIP}(\theta) = \hat{\mathbb{E}}_t\left[\min\left(r_t(\theta)\hat{A}_t,\ \operatorname{clip}(r_t(\theta), 1-\epsilon, 1+\epsilon)\hat{A}_t\right)\right]$. My question is: how does this objective behave in a sparse reward setting (i.e., the reward is only given after a sequence of actions has been taken)? In this case we don't have $\hat{A}_{t}$…
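Worth noting for this question: $\hat{A}_t$ remains well-defined under sparse rewards, since generalized advantage estimation needs only the per-step rewards (mostly zero here) and the learned value function, which spreads the terminal signal backward. A sketch of GAE under those assumptions (names are illustrative):

```python
import numpy as np

def gae(rewards, values, gamma=0.99, lam=0.95):
    """Generalized Advantage Estimation for one episode. Works with
    sparse rewards: zero rewards contribute nothing, and the critic's
    bootstrapped values carry the terminal reward's signal to earlier
    timesteps. `values` has length len(rewards) + 1 (bootstrap value
    appended; use 0.0 for terminal states)."""
    advantages = np.zeros(len(rewards))
    gae_t = 0.0
    for t in reversed(range(len(rewards))):
        delta = rewards[t] + gamma * values[t + 1] - values[t]
        gae_t = delta + gamma * lam * gae_t
        advantages[t] = gae_t
    return advantages
```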
0
votes
0 answers

Reinforcement Learning with sparse/delayed reward - should intermediate rewards be decayed over time/training?

I'm thinking of a situation like a game (say, chess) where the real objective/reward is actually determined at the very end. I understand that it's important/helpful to do reward shaping with intermediate rewards, so that the agent can get clues of…
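A common pattern matching what the question describes is to anneal a coefficient on the shaping term, so the agent leans on intermediate rewards early and optimizes against the true terminal objective late in training. A minimal sketch; the linear schedule and parameter names are assumptions, not an established recipe:

```python
def annealed_reward(true_reward, shaping_reward, step, total_steps,
                    initial_coef=1.0, final_coef=0.0):
    """Blend intermediate (shaping) rewards with the true reward,
    linearly decaying the shaping coefficient over training so the
    final policy is optimized against the real objective only."""
    frac = min(step / total_steps, 1.0)
    coef = initial_coef + frac * (final_coef - initial_coef)
    return true_reward + coef * shaping_reward
```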