
Asked Sep 26 '22 at 20:25

Active Sep 26 '22 at 20:25

Viewed 109 times

I'm thinking of a situation like a game (say, chess) where the real objective/reward is actually determined at the very end.

I understand that it's important/helpful to do reward shaping with intermediate rewards, so that the agent can get clues of what is good/bad behavior leading to the final result. However, I would greatly appreciate advice or discussion about whether these intermediate rewards should be phased out over time.

For example, let's say the agent is playing chess. I imagine it's helpful to give rewards/punishments for captured/lost pieces and then a BIG reward/punishment at the end for victory/defeat. As training goes on, though, would you recommend decaying/removing the intermediate rewards?
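To make the question concrete, here is a minimal sketch of the kind of decay I have in mind. It assumes a linear schedule, and the names (`shaping_weight`, `total_reward`, `decay_episodes`, `shaping_scale`) and default values are placeholders I made up for illustration, not taken from any library:

```python
def shaping_weight(episode: int, decay_episodes: int = 10_000) -> float:
    """Linearly anneal the shaping weight from 1.0 down to 0.0.

    `decay_episodes` is a hypothetical hyperparameter: the episode index
    at which the intermediate rewards are fully switched off.
    """
    return max(0.0, 1.0 - episode / decay_episodes)


def total_reward(
    terminal_reward: float,
    material_delta: float,
    episode: int,
    decay_episodes: int = 10_000,
    shaping_scale: float = 0.1,
) -> float:
    """Combine the true (sparse) reward with a decayed intermediate reward.

    terminal_reward: win/loss signal, nonzero only on the final step.
    material_delta:  value of pieces captured minus pieces lost on this step.
    shaping_scale:   assumed factor keeping the shaping term small
                     relative to the terminal reward.
    """
    weight = shaping_weight(episode, decay_episodes)
    return terminal_reward + weight * shaping_scale * material_delta
```

Early in training the capture/loss term contributes at full strength; by episode `decay_episodes` only the win/loss reward remains. Whether a schedule like this helps or hurts compared to keeping the intermediate rewards fixed is exactly what I'm asking about.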


Vladimir Belik

Reinforcement Learning with sparse/delayed reward - should intermediate rewards be decayed over time/training?

0 Answers