The episode length increases at the start of training until it reaches a peak, then decreases. What can cause this unexpected behavior?

Asked Nov 22 '22 at 11:39

Active Nov 22 '22 at 11:39

Viewed 24 times

I am running the A3C algorithm to evaluate a policy based on a policy gradient method. I observe unexpected behavior in the reward and the episode length early in training. As shown in the figure below, at the start of training the episode length increases, reaches a peak, and then decreases and settles into normal behavior. The reward does the opposite: it decreases at the start and then increases. What can cause this kind of unexpected behavior?
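To make the pattern described above easier to inspect, here is a minimal sketch of how the two curves could be smoothed and compared. It assumes the per-episode lengths and returns are available as plain Python lists. The names `moving_average`, `pearson`, `episode_lengths`, and `episode_returns` are hypothetical and not taken from the actual A3C code, and the numbers are synthetic, for illustration only.

```python
from math import sqrt
from statistics import mean


def moving_average(values, window=10):
    """Moving average over `values`; returns len(values) - window + 1 points."""
    if not 1 <= window <= len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [mean(values[i:i + window]) for i in range(len(values) - window + 1)]


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences with non-zero variance."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Synthetic, illustrative data (NOT real training logs):
# the length rises to a peak and then falls; the return mirrors it.
episode_lengths = [10, 20, 30, 40, 30, 20, 10]
episode_returns = [-x for x in episode_lengths]

smoothed = moving_average(episode_lengths, window=3)
peak = smoothed.index(max(smoothed))  # position of the peak in the smoothed curve
corr = pearson(episode_lengths, episode_returns)
```

If the correlation between the logged lengths and returns is close to -1, the reward dip and the length peak are probably two views of the same effect rather than two separate problems.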


Salwa Mostafa


0 Answers