Questions tagged [potential-reward-shaping]

For questions about potential-based reward shaping, which was introduced in the paper "Policy Invariance Under Reward Transformations: Theory and Application to Reward Shaping" by Andrew Y. Ng et al. (1999).
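As defined in that paper, the shaping term is the discounted difference of a real-valued potential function $\Phi$ over states,

$$F(s, a, s') = \gamma\,\Phi(s') - \Phi(s),$$

and the agent is trained on the shaped reward $R'(s, a, s') = R(s, a, s') + F(s, a, s')$; shaping functions of this form are guaranteed to leave the optimal policy unchanged.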

4 questions
6 votes, 1 answer

How to improve the reward signal when the rewards are sparse?

In cases where the reward is delayed, a model's ability to do proper credit assignment can be negatively impacted. In the case of a sparse reward, are there ways in which this can be mitigated? In a chess example, there are certain moves that you can…
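A standard remedy, and the reason this question falls under this tag, is to layer a potential-based shaping term on top of the sparse reward. Below is a minimal runnable sketch on a toy corridor task; the goal position, the distance-based potential, and the names (`GOAL`, `phi`, `shaped_reward`) are illustrative assumptions, not taken from the question:

```python
# Potential-based reward shaping on a sparse-reward corridor: the environment
# pays 1 only at the goal, while Phi(s) = -(distance to goal) turns that
# sparse signal into dense per-step feedback without changing the optimal
# policy (Ng et al., 1999).

GAMMA = 0.99
GOAL = 10  # illustrative goal cell on a 1-D corridor


def phi(state: int) -> float:
    """Heuristic potential: closer to the goal means higher potential."""
    return -abs(GOAL - state)


def shaped_reward(state: int, next_state: int) -> float:
    # Sparse environment reward: 1 at the goal, 0 everywhere else.
    r = 1.0 if next_state == GOAL else 0.0
    # F(s, s') = gamma * Phi(s') - Phi(s)
    return r + GAMMA * phi(next_state) - phi(state)


if __name__ == "__main__":
    print(shaped_reward(3, 4))  # > 0: a step toward the goal is rewarded now
    print(shaped_reward(4, 3))  # < 0: a step away is penalized now
```

In the chess setting from the question, $\Phi$ could be any cheap heuristic evaluation of a position (e.g., material balance): the shaping term then gives immediate feedback on each move, while the invariance result keeps the eventual optimal policy tied to the true win/loss reward.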
3 votes, 2 answers

What should I do when the potential value of a state is too high?

I'm working on a reinforcement learning task where I use reward shaping as proposed in the paper "Policy Invariance Under Reward Transformations: Theory and Application to Reward Shaping" (1999) by Andrew Y. Ng, Daishi Harada, and Stuart Russell. In…
3 votes, 1 answer

Expressing Arbitrary Reward Functions as Potential-Based Advice (PBA)

I am trying to reproduce the results for the simple grid-world environment in [1], but it turns out that using a dynamically learned PBA makes the performance worse, and I cannot obtain the results shown in Figure 1(a) of [1] (with the same…
2 votes, 1 answer

Why does potential-based reward shaping seem to alter the optimal policy in this case?

It is known that shaping with any potential function won't alter the optimal policy [1]. I don't understand why that is. The definition: $$R' = R + F,$$ with $$F = \gamma\Phi(s') - \Phi(s),$$ where, let's suppose, $\gamma = 0.9$. If I have the following…
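The usual resolution to questions like this is the telescoping argument: along any trajectory $s_0, s_1, \dots, s_T$, the discounted shaping terms sum to $\gamma^T\Phi(s_T) - \Phi(s_0)$, which does not depend on the actions taken. A small numeric check, using $\gamma = 0.9$ as in the question and arbitrary illustrative potential values:

```python
# Numeric check of the telescoping argument with gamma = 0.9 as in the
# question. The potential values below are arbitrary illustrative numbers.

GAMMA = 0.9
potentials = {"s0": 1.0, "s1": 5.0, "s2": 2.0, "s3": 0.0}


def discounted_shaping_sum(trajectory):
    """Sum of gamma^t * F(s_t, s_{t+1}) with F = gamma*Phi(s') - Phi(s)."""
    total = 0.0
    for t, (s, s_next) in enumerate(zip(trajectory, trajectory[1:])):
        total += (GAMMA ** t) * (GAMMA * potentials[s_next] - potentials[s])
    return total


traj = ["s0", "s1", "s2", "s3"]
T = len(traj) - 1
# Both lines print the same value (-1.0, up to float rounding): the shaping
# contribution is fixed by the start and end states alone.
print(discounted_shaping_sum(traj))
print(GAMMA ** T * potentials[traj[-1]] - potentials[traj[0]])
```

Because every policy's return is shifted by the same action-independent offset (and $\Phi$ is conventionally 0 at terminal states), the ordering of policies, and hence the optimal policy, is unchanged; a common source of apparent counterexamples is a nonzero potential at terminal states.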