For questions about potential-based reward shaping, introduced in the paper "Policy Invariance Under Reward Transformations: Theory and Application to Reward Shaping" by Andrew Y. Ng, Daishi Harada, and Stuart Russell (1999).
Questions tagged [potential-reward-shaping]
4 questions
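As a quick orientation for the questions below, here is a minimal sketch (in Python, with `potential` standing in for a user-supplied Φ; the function name is illustrative) of how the shaping term from the Ng et al. paper is added to the environment reward:

```python
def shaped_reward(reward, potential, s, s_next, gamma=0.99):
    """Return the environment reward plus the shaping term F.

    F = gamma * Phi(s') - Phi(s), where Phi (`potential`) is any
    real-valued function of the state; by the Ng et al. (1999) result,
    adding F leaves the optimal policy unchanged. For episodic tasks,
    Phi is conventionally 0 at terminal states.
    """
    F = gamma * potential(s_next) - potential(s)
    return reward + F
```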
6 votes, 1 answer
How to improve the reward signal when the rewards are sparse?
When the reward is delayed, a model's ability to do proper credit assignment can suffer. In the case of a sparse reward, are there ways in which this can be mitigated?
In a chess example, there are certain moves that you can…
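One standard answer under this tag is potential-based shaping with a hand-designed potential, e.g. material balance in chess, which densifies the sparse win/loss reward while leaving the optimal policy unchanged. A hedged sketch, where the piece values and the `board.piece_symbols()` accessor are assumptions, not a real chess library API:

```python
# Hypothetical piece values; `board.piece_symbols()` is an assumed API
# returning one character per piece, uppercase for White, lowercase for Black.
PIECE_VALUES = {'P': 1, 'N': 3, 'B': 3, 'R': 5, 'Q': 9, 'K': 0}

def material_balance(board):
    """Heuristic potential Phi: positive when White is ahead in material."""
    return sum(PIECE_VALUES[p.upper()] * (1 if p.isupper() else -1)
               for p in board.piece_symbols())

def shaping_term(board, next_board, gamma=0.99):
    # F = gamma * Phi(s') - Phi(s): gives immediate feedback on material
    # swings while, by the policy-invariance theorem, leaving the optimal
    # policy of the underlying sparse-reward game unchanged.
    return gamma * material_balance(next_board) - material_balance(board)
```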

tryingtolearn (385)
3 votes, 2 answers
What should I do when the potential value of a state is too high?
I'm working on a Reinforcement Learning task where I use reward shaping as proposed in the paper Policy invariance under reward transformations: Theory and application to reward shaping (1999) by Andrew Y. Ng, Daishi Harada and Stuart Russell.
In…
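For context on this question: the policy-invariance guarantee holds for any real-valued Φ, so a rescaled potential is itself a valid potential. A minimal sketch of that idea (the function name is illustrative):

```python
def scaled_potential(phi, c):
    """Return the potential s -> c * phi(s).

    Any real-valued function of the state is a valid potential, so a
    rescaled Phi is too: shaping with c * Phi is still policy-invariant,
    and c only controls the magnitude of the shaping signal that the
    learner sees.
    """
    return lambda s: c * phi(s)
```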

Marco Favorito (185)
3 votes, 1 answer
Expressing Arbitrary Reward Functions as Potential-Based Advice (PBA)
I am trying to reproduce the results for the simple grid-world environment in [1]. However, using a dynamically learned PBA makes the performance worse, and I cannot obtain the results shown in Figure 1(a) of [1] (with the same…
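For readers without the paper at hand: [1] here is Harutyunyan et al., "Expressing Arbitrary Reward Functions as Potential-Based Advice" (AAAI 2015), which extends shaping from state potentials to state-action potentials. If memory serves, its look-ahead advice uses the shaping term
$$F(s, a, s', a') = \gamma\,\Phi(s', a') - \Phi(s, a),$$
with $\Phi$ learned online in the dynamic variant.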

bcxiao (33)
2 votes, 1 answer
Why does potential-based reward shaping seem to alter the optimal policy in this case?
It is known that no potential function alters the optimal policy [1], but I do not understand why that is.
The definition:
$$R' = R + F, \qquad F = \gamma\Phi(s') - \Phi(s),$$
where, let's suppose, $\gamma = 0.9$.
If I have the following…
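For what it's worth, the standard argument behind the invariance result: along any trajectory the shaping terms telescope,
$$\sum_{t=0}^{\infty} \gamma^t F(s_t, s_{t+1}) = \sum_{t=0}^{\infty} \gamma^t \big(\gamma\Phi(s_{t+1}) - \Phi(s_t)\big) = -\Phi(s_0),$$
so $Q'_\pi(s,a) = Q_\pi(s,a) - \Phi(s)$ for every policy $\pi$. Since the offset $\Phi(s)$ does not depend on the action, the argmax over actions, and hence the optimal policy, is unchanged.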

ScientiaEtVeritas (155)