Questions tagged [reward-clipping]

For questions about clipping the rewards observed in a problem (typically a Markov Decision Process) to a limited range, often [-1, 1]. Clipping is sometimes used to stabilize learning, most notably in DQN and related reinforcement learning algorithms applied to problems such as the Atari games.
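
As a rough illustration of what the tag covers, here is a minimal sketch of reward clipping in plain Python. The helper name `clip_reward`, its default bounds, and the sample rewards are assumptions made for this example; they are not taken from DQN or any particular library.

```python
def clip_reward(reward, low=-1.0, high=1.0):
    """Limit a scalar reward to the closed interval [low, high].

    Hypothetical helper for illustration; the default bounds mirror
    the commonly used [-1, 1] range.
    """
    return max(low, min(high, reward))


# Made-up raw rewards, e.g. score changes of very different magnitudes.
raw_rewards = [-5.0, -0.5, 0.0, 0.3, 10.0]

# Rewards already inside the range pass through unchanged;
# the rest are pushed to the nearest bound.
clipped_rewards = [clip_reward(r) for r in raw_rewards]
print(clipped_rewards)
```

Note that clipping discards magnitude information: after clipping, a reward of 10 and a reward of 2 are indistinguishable to the agent, which is one reason the practice comes up in questions under this tag.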

3 questions

1 answer

Should the reward or the Q value be clipped for reinforcement learning?

When extending reinforcement learning to the continuous-state, continuous-action case, we must use function approximators (linear or non-linear) to approximate the Q-value. It is well known that non-linear function approximators, such as neural…

asked Oct 10 '18 at 23:45

Rui Nian

2 answers

What is the main difference between additive rewards and discounted rewards?

What is the difference between additive and discounted rewards?

reinforcement-learning comparison rewards reward-clipping

asked Dec 09 '18 at 08:29

Marosh Fatima

0 answers

Deciding the rewards for different actions in Pong for a DQN agent

I am attempting to implement an agent that learns to play in the Pong environment. The environment was created in PyGame, and I return the pixel data and score at each frame. I use a CNN that takes a stack of the last 4 frames as input and predicts the…

reinforcement-learning dqn rewards reward-clipping

asked Apr 13 '19 at 16:57

RMMD12